Abstract

Shape analysis is a promising technique for statically ver- 
ifying and extracting properties of programs that manip- 
ulate complex data structures. We introduce a new char- 
acterization of constraints that arise in parametric shape 
analysis based on manipulation of three-valued structures 
as dataflow facts. 

We identify an interesting syntactic class of first-order 
logic formulas that captures the meaning of three-valued 
structures under concretization. This class is broader than 
previously introduced classes, allowing for a greater flex- 
ibility in the formulation of shape analysis constraints in 
program annotations and internal analysis representations. 
Three-valued structures can be viewed as one possible nor- 
mal form of the formulas in our class. 

Moreover, we characterize the meaning of three-valued 
structures under "tight concretization". We show that the 
seemingly minor change from concretization to tight con- 
cretization increases the expressive power of three-valued 
structures in such a way that the resulting constraints are 
closed under all boolean operations. We call the resulting 
constraints boolean shape analysis constraints. 

The main technical contribution of this paper is a natu- 
ral syntactic characterization of boolean shape analysis con- 
straints as arbitrary boolean combinations of first-order sen- 
tences of certain form, and an algorithm for transforming 
such boolean combinations into the normal form that corresponds
directly to three-valued structures.

Our result holds in the presence of arbitrary shape anal- 
ysis instrumentation predicates. The result enables the re- 
duction (without any approximation) of the entailment and 
the equivalence of shape analysis constraints to the satisfia- 
bility of shape analysis constraints. When the satisfiability 
of the constraints is decidable, our result implies that the 
entailment and the equivalence of the constraints are also 
decidable, which enables the use of constraints in a compo- 
sitional shape analysis with a predictable behavior. 
1 Introduction

Dynamically Allocated Data Structures Modern 
software is becoming increasingly complex. This complexity 
corresponds to the complex web of relationships between 
different program entities. Object-oriented programming 
languages such as Java use references between objects dy- 
namically allocated in the heap to model the relationships 
between entities of the application domain. Dynamic allo- 
cation of objects provides flexibility that helps applications 
adapt to the dynamically changing environment. To model 
the evolution of the relationships between objects, applica- 
tions perform destructive updates of the heap. Because writ- 
ing applications in this programming model is error-prone, 
tools for statically verifying partial correctness of such pro- 
grams are very valuable. 

Shape Analysis Shape analysis techniques [49], [21], [20], [43]
can verify and derive precise properties of objects in the
heap. Shape analysis therefore appears essential for reason- 
ing about programs written in modern imperative program- 
ming languages. Shape analysis is promising as a general- 
purpose verification technique, because of its ability to rea- 
son about graphs as general structures, and the ability to 
summarize properties of unbounded sets of objects. Shape 
analysis such as [35] is effective in deriving program proper- 
ties at each program point and synthesizing loop invariants 
while maintaining high precision and strong soundness guar- 
antees. 

Program Specifications The ability to write program 
specifications can greatly improve the effectiveness of shape 
analysis (and, for that matter, the effectiveness of any static 
or dynamic analysis in general). First of all, specifications 
indicate the desired property to be verified. Next, specifi- 
cations allow the use of assume/guarantee reasoning, which 
improves the scalability of the analysis and enables its ap- 
plication to reusable program components. Finally, if neces- 
sary, specifications can guide the static analysis and provide 
hints for it, while at the same time leaving a documentation 
trace explaining the correctness of the program. 

Analysis-Specification Gap The representation of pro- 
gram properties used by the program analysis is often differ- 
ent from the representation of program properties that is ap- 
propriate for program annotations. To synthesize invariants 
using a fixpoint computation, program analysis often uses a 
finite lattice of program properties. On the other hand, pro- 
gram annotations should be expressed in some convenient, 
well-known notation, such as a variation of first-order logic. 
A program analysis that utilizes program specifications must 
bridge the gap between the analysis representation and the 
program annotations. 

Logic-Based Shape Analysis A promising shape anal- 
ysis approach [49] based on abstract interpretation [14] uses 
the lattice of three-valued logical structures for fixpoint
computation. The fact that the approach is based on logic
makes bridging the gap between the program annotations 
and the analysis representation much easier, yet it does not 
eliminate it entirely. The original TVLA system [38] uses 
three-valued structures to specify preconditions, which
corresponds to specifications with disjoint, non-empty sets of
objects and is sometimes unnecessarily verbose. The followup
work [46], [52] shows how to use arbitrary first-order
formulas for program annotations and convert the annotations
to three-valued structures using a theorem prover.
Because first-order logic is undecidable in general, it is
interesting to consider alternative approaches with
potentially more predictable behavior.

1.1 Contributions 

Mediating the Analysis-Specification Gap This paper
addresses the gap between program annotations and
three-valued structures by providing an algorithm for
transforming annotations (expressed as formulas) into
three-valued structures, as well as a way of viewing a class of
canonical program annotations as three-valued structures. 
Because we restrict our attention to formulas of a particular 
form, we are able to find a complete and sound algorithm 
for generating three-valued structures. The completeness 
makes our algorithm potentially more predictable than the 
use of theorem provers on arbitrary formulas. Our algorithm 
shows that the expressive power of our specifications is equal 
to the expressive power of three-valued structures.
Nevertheless, our specifications may use sets that are potentially
intersecting or empty, which makes the annotations more
flexible than three-valued structures themselves, where
summary nodes represent only disjoint sets of nodes. Moreover,
the characterization of existing shape analysis constraints
as disjunctive normal forms of formulas suggests that
alternative representations for three-valued structures may be
possible.

The characterization of three-valued structures by for- 
mulas allows us to easily prove properties that are less ob- 
vious in the three-valued structure view, such as closure 
of three-valued structures under conjunction. To compute 
the conjunction of three-valued structures, we use the fact 
that three-valued structures correspond to disjunctive nor- 
mal forms of positive boolean combinations of formulas; the 
computation of the conjunction of three-valued structures 
then corresponds to a transformation of a conjunction of 
two disjunctive normal forms into a new disjunctive normal 
form. 
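
The transformation of a conjunction of two disjunctive normal forms into a new disjunctive normal form can be illustrated by a minimal sketch. The encoding below (literals as name/polarity pairs, conjunctions as frozensets, the function names `dnf_and` and `is_contradictory`) is an assumption made for illustration; it performs only the distribution step and does not restore the canonical form discussed later in the paper.

```python
from itertools import product

# Illustrative encoding (an assumption, not the paper's representation):
# a literal is (atom_name, polarity), a conjunction is a frozenset of
# literals, and a DNF is a set of conjunctions.

def is_contradictory(conj):
    """True if the conjunction contains both an atom and its negation."""
    return any((name, not pol) in conj for (name, pol) in conj)

def dnf_and(d1, d2):
    """Distribute the conjunction of two DNFs into a single DNF."""
    result = set()
    for c1, c2 in product(d1, d2):
        merged = c1 | c2
        if not is_contradictory(merged):   # drop conjunctions equal to false
            result.add(merged)
    return result

d1 = {frozenset({("p", True)}), frozenset({("q", True)})}
d2 = {frozenset({("p", False)}), frozenset({("r", True)})}
conj = dnf_and(d1, d2)
```

Each pair of disjuncts contributes one merged conjunction, and pairs that contain an atom together with its negation are discarded.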

Boolean Shape Analysis Constraints By considering
the "tight concretization" semantics instead of the
concretization semantics of three-valued structures, we obtain a
richer class of formulas, namely the class of all boolean
combinations of certain atomic formulas. This characterization
implies that three-valued structures under tight concretization
are closed under all boolean operations. We therefore call the
constraints arising from tight concretization of three-valued
structures boolean shape analysis constraints.

Although the notion of tight concretization is not new, 
the characterization of boolean shape analysis constraints 
as boolean combinations of certain formulas is surprisingly 
elegant and has not been observed before. 

Consequences of Boolean Closure The resulting closure
properties of boolean shape analysis constraints have
several potential uses. The closure under disjunction is
necessary for fixpoint computation in dataflow analysis and
can be conveniently computed even for shape analysis
constraints; what our results show is that boolean shape
analysis constraints are also closed under conjunction and
negation.

The conjunction of constraints is needed, for example, in 
compositional interprocedural shape analysis, which com- 
putes the relation composition of relations on states. Con- 
junction allows the analysis to simultaneously retain the 
call-site specific information that the callee preserves across 
the call, and the postcondition which summarizes the ac- 
tions of the callee. 

The negation of constraints is useful for expressing
deterministic branches in control-flow graphs. For example, an
if statement with the condition $c$ results in conjoining the
dataflow fact $d$ to yield $d \land c$ in the if branch, and
$d \land \lnot c$ in the else branch. Similarly, the assert(c)
statement, which is an important mechanism for program
specification, has (in the relational semantics) the condition
$\lnot c$ for the branch which leads to an error state.

Finally, the closure under negation implies that both 
the implication and the equivalence of shape analysis con- 
straints are reducible to the satisfiability of shape analysis 
constraints. This result is in contrast to "regular graph
constraints" of [35], which have a decidable satisfiability
problem but undecidable implication and equivalence problems.
The entailment problem is also important for compositional
analysis which uses assume/guarantee reasoning. By
introducing history variables that store the initial state of 
the program, a compositional interprocedural shape analysis 
can use shape analysis constraints to represent relations on 
program states. The fundamental operations of such compo- 
sitional shape analysis are computation of the best approx- 
imation of relation composition and checking the subset of 
relations. Closure under boolean operations allows reduc- 
ing all these operations to the satisfiability of shape analysis 
constraints. 
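
The reduction of entailment and equivalence to satisfiability relies only on closure under conjunction and negation. A minimal sketch follows; it assumes a satisfiability oracle, here replaced by brute-force enumeration over propositional assignments as a stand-in, since a decision procedure for shape analysis constraints is outside the scope of this illustration. The names `satisfiable`, `entails`, and `equivalent` are hypothetical.

```python
from itertools import product

def satisfiable(formula, atoms):
    """Stand-in oracle: brute-force search over truth assignments."""
    return any(formula(dict(zip(atoms, vals)))
               for vals in product([False, True], repeat=len(atoms)))

def entails(f, g, atoms):
    # f entails g  iff  f AND (NOT g) is unsatisfiable
    return not satisfiable(lambda e: f(e) and not g(e), atoms)

def equivalent(f, g, atoms):
    # f is equivalent to g  iff  each entails the other
    return entails(f, g, atoms) and entails(g, f, atoms)

atoms = ["c", "d"]
f = lambda e: e["c"] and e["d"]   # d AND c, as in an if branch
g = lambda e: e["d"]              # the dataflow fact d alone
```

With these definitions `entails(f, g, atoms)` holds while `entails(g, f, atoms)` does not, so the two constraints are not equivalent.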

Scope of the Result Our result is relevant in the pres- 
ence of shape analysis instrumentation predicates defined 
using arbitrary first-order formulas. What the particular 
choice of instrumentation predicates determines is whether 
the satisfiability problem for boolean shape analysis con- 
straints is decidable. If the satisfiability problem for shape 
analysis constraints with a particular choice of instrumenta- 
tion predicates is decidable, our closure results imply that 
the entailment problem is also decidable, and that the con- 
straints are suitable for use in an instantiation of the shape 
analysis framework. 

Summary of contributions We can summarize the con- 
tributions of this paper as follows: 

1. We give a concrete example that shows how elements 
of the lattice for fixpoint computation can be viewed as 
formulas in a canonical form; we believe that this idea 
is useful in general. 

2. We identify a syntactic class of formulas whose ex- 
pressive power matches exactly the semantics of three- 
valued structures under concretization. The resulting 
constraints are closed under disjunction and conjunc- 
tion, but are not necessarily closed under negation. 

3. We identify a syntactic class of formulas whose
expressive power matches exactly the semantics of
three-valued structures under tight concretization. The
resulting boolean shape analysis constraints are closed
under all boolean operations such as disjunction,
conjunction, negation, implication, and equivalence.

4. We observe that the closure under all boolean opera- 
tions allows reducing the entailment and the equiva- 
lence problems to the satisfiability problem of boolean 
shape analysis constraints. 

5. We show that each three-valued structure has a model 
within the set of two-valued structures, which means 
that the satisfiability problem of shape analysis con- 
straints is trivial over the set of all two-valued struc- 
tures. 

6. We show that, even in the presence of instrumentation 
predicates, our results allow reducing the entailment 
and the equivalence problems of shape analysis con- 
straints to the satisfiability problem. 

1.2 Organization of the Paper 

The rest of the paper is organized as follows. Section 2
reviews the basic notions of two-valued and three-valued
structures. Section 3 presents a series of syntactic classes
of formulas of equal expressive power that all characterize
the meaning of three-valued structures under concretization
(Corollary 28). As a consequence, we derive the closure
of constraints under disjunction and conjunction
(Corollary 29). Section 3 is to some extent a preparation for
Section 4. Section 4 introduces a series of formulas that
have the same expressive power (Corollary 45) as the
three-valued structures under tight concretization (Definition 30),
and introduces the name boolean shape analysis constraints
(Definition 46). Section 4.1 observes that boolean shape
analysis constraints are closed under all boolean operations
and derives some consequences of these closure properties.
Section 4.2 shows that boolean shape analysis constraints
are the smallest extension of three-valued structures under
concretization which is closed under all boolean operations
(Proposition 51). Section 4.3 shows how to transform
a three-valued structure into a structure where all unary
predicates have definite values. Section 5 introduces the
decidability problems for three-valued structures, shows that
every three-valued structure is satisfiable, and derives the
decidability of the implication and the equivalence as a
consequence of the decidability of satisfiability and the closure
under boolean operations. Section 6 generalizes the results
of the previous sections to the case when the values of some
predicates are constrained by first-order formulas. Section 7
presents the related work and Section 8 concludes.

2 Preliminaries 

In this section we define some preliminary concepts used 
throughout the paper. We mostly follow the setup of [49] 
and for completeness repeat some of the definitions from 

gSEl]. 

Let $\mathcal{A}$ be a finite set of unary relation symbols (with a
typical element $A \in \mathcal{A}$) and $\mathcal{F}$ a finite set of binary relation
symbols (with a typical element $f \in \mathcal{F}$). For simplicity, we consider
only unary and binary relation symbols because they appear 
to be the most useful cases. Most of our results generalize 
naturally to n-ary relations. 



Two-Valued Structures We next introduce two-valued
structures. A two-valued structure consists of a domain $U^\natural$
and the interpretation $\iota^\natural$ of relation symbols. Our language
does not contain function symbols because we represent all
functions as relations. In model theory and logic, a
two-valued structure corresponds to a structure (model) whose
domain $U^\natural$ is finite.

Definition 1 A two-valued structure is a pair $S^\natural = (U^\natural, \iota^\natural)$
where $U^\natural$ is a finite non-empty set (of "concrete individuals"),
$\iota^\natural(A) \in U^\natural \to \{0, 1\}$ for $A \in \mathcal{A}$, and
$\iota^\natural(f) \in (U^\natural)^2 \to \{0, 1\}$ for $f \in \mathcal{F}$. Let

$2\text{-STRUCT} = \{ S^\natural \mid S^\natural = (U^\natural, \iota^\natural) \text{ is a two-valued structure} \}$

In program analysis, each two-valued structure represents a 
state of the program. The use of structures for representing 
program state has proven useful in shape analysis [49],
Abstract State Machines [7], the Alloy modelling language
and analyzer [27], and relational databases [13], [15].

Three-Valued Structures A three-valued structure is a
model for Kleene's three-valued logic [32], [44] and differs
from a two-valued structure in that predicates can
have three possible values: {0}, {1}, and {0, 1}. (The truth
values {0}, {1}, {0, 1} of three-valued logic are denoted by,
respectively, 0, 1, 1/2 in [49].)

Definition 2 A three-valued structure is a pair $S = (U, \iota)$
where $U$ is a finite non-empty set (of "abstract individuals"),
$\iota(A) \in U \to \{\{0\},\{1\},\{0,1\}\}$ for $A \in \mathcal{A}$, and
$\iota(f) \in U^2 \to \{\{0\},\{1\},\{0,1\}\}$ for $f \in \mathcal{F}$. Let

$3\text{-STRUCT} = \{ S \mid S = (U, \iota) \text{ is a three-valued structure} \}$

The parametric shape analysis framework [?§] uses three- 
valued structures to specify sets of two-valued structures 
according to Definition 4 below.
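
For concreteness, Definitions 1 and 2 can be mirrored by a small data representation. This is a sketch under assumptions: the class names, field names, and the helper `well_formed_3` are hypothetical, and truth values {0}, {1}, {0, 1} are encoded as frozensets.

```python
from dataclasses import dataclass

ZERO, ONE, UNKNOWN = frozenset({0}), frozenset({1}), frozenset({0, 1})

@dataclass
class TwoValued:
    universe: frozenset   # concrete individuals
    unary: dict           # symbol -> {individual: 0 or 1}
    binary: dict          # symbol -> {(individual, individual): 0 or 1}

@dataclass
class ThreeValued:
    universe: frozenset   # abstract individuals
    unary: dict           # symbol -> {individual: ZERO, ONE or UNKNOWN}
    binary: dict          # symbol -> {(individual, individual): ZERO, ONE or UNKNOWN}

def well_formed_3(s: ThreeValued) -> bool:
    """Interpretations must be total and use only the three truth values."""
    values = {ZERO, ONE, UNKNOWN}
    pairs = {(u, v) for u in s.universe for v in s.universe}
    ok_unary = all(set(t) == set(s.universe) and set(t.values()) <= values
                   for t in s.unary.values())
    ok_binary = all(set(t) == pairs and set(t.values()) <= values
                    for t in s.binary.values())
    return bool(s.universe) and ok_unary and ok_binary

# A hypothetical abstract list: u0 is pointed to by x, u1 stands for the rest.
abstract = ThreeValued(
    universe=frozenset({"u0", "u1"}),
    unary={"x": {"u0": ONE, "u1": ZERO}},
    binary={"next": {("u0", "u0"): ZERO, ("u0", "u1"): UNKNOWN,
                     ("u1", "u0"): ZERO, ("u1", "u1"): UNKNOWN}},
)
```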

Formulas We assume the usual syntax and semantics of 
first-order logic. We use an abstract view of the syntax 
of formulas in first-order logic which takes into account as- 
sociativity, commutativity, and idempotence of conjunction 
and disjunction, and the property $\lnot\lnot p = p$. A conjunction
with zero conjuncts denotes true; a disjunction with zero 
disjuncts denotes false. 

If $S^\natural$ is a two-valued structure and $F$ a formula with free
variables $x_1, \ldots, x_n$ and $u_1, \ldots, u_n \in U^\natural$, then
$e = [x_1 \mapsto u_1, \ldots, x_n \mapsto u_n]$ denotes an environment mapping $x_i$ to $u_i$
for all $1 \le i \le n$, and $[\![F]\!]^{S^\natural} e$ denotes the value $v \in \{0, 1\}$
of the formula $F$ in the model $S^\natural$ under the environment
$e$. Instead of $[\![F(x)]\!]^{S^\natural} [x \mapsto u^\natural] = 1$ we sometimes write
$S^\natural \models F(u^\natural)$ and omit $S^\natural$ if it is understood from the context. If $F$
has no free variables we denote the truth value $v$ of $F$ in $S^\natural$
simply by $[\![F]\!]^{S^\natural}$ and write $S^\natural \models F$ for $[\![F]\!]^{S^\natural} = 1$. Definition 3
below defines the set of models of a formula in the expected
way.

Definition 3 (Models of Sets of Formulas) Let $F$ be a
first-order formula. Then

$\gamma_F(F) = \{ S^\natural \in 2\text{-STRUCT} \mid [\![F]\!]^{S^\natural} = 1 \}$

If $C$ is a set of formulas, define

$\mathrm{models}[C] = \{ \gamma_F(F) \mid F \in C \}$

The transitive closure operator or inductive definitions
are useful for describing instrumentation predicates
(Section 6), but the presence of such constructs in logic is largely
orthogonal to the results of this paper.

For simplicity we treat equality like any other binary 
relation symbol and do not treat summary nodes specially, 
but our results are also useful in the presence of summary 
nodes (see [36], as well as [44] and Section rob. 

3 Three-Valued Structures with Concretization

This section uses first-order formulas to characterize the
meaning of three-valued structures under the usual
concretization function. Section 4 presents an alternative
semantics using tight concretization, which yields constraints
with better closure properties.

The following notion of concretization corresponds to
[48, Definition 3.5]. The concretization function $\gamma^*$ provides the
semantics for sets of three-valued structures.

Definition 4 (Homomorphism and Concretization)
Let $S^\natural = (U^\natural, \iota^\natural)$ be a two-valued structure, $S = (U, \iota)$ a
three-valued structure, and $h : U^\natural \to U$ a surjective total
function. We write $S^\natural \sqsubseteq^h S$ iff

1. for every $A \in \mathcal{A}$ and $u \in U$:

$\iota(A)(u) \supseteq \{ \iota^\natural(A)(u^\natural) \mid h(u^\natural) = u \}$

2. for every $f \in \mathcal{F}$ and $u_1, u_2 \in U$:

$\iota(f)(u_1, u_2) \supseteq \{ \iota^\natural(f)(u_1^\natural, u_2^\natural) \mid h(u_1^\natural) = u_1 \land h(u_2^\natural) = u_2 \}$

We write $S^\natural \sqsubseteq S$ iff there exists a surjective total function
$h$ such that $S^\natural \sqsubseteq^h S$. We call any such $h$ a homomorphism
from $S^\natural$ to $S$. The concretization of a three-valued structure
$S$, denoted $\gamma(S)$, is given by:

$\gamma(S) = \{ S^\natural \mid S^\natural \sqsubseteq S \}$

We extend $\gamma$ to $\gamma^*$ acting on sets of three-valued structures
so that the set denotes a disjunction:

$\gamma^*(\mathcal{S}) = \bigcup_{S \in \mathcal{S}} \gamma(S)$

The function $h$ from Definition 4 is called "embedding" in
[49]. (We choose to call $h$ "homomorphism" because in the
literature the term "embedding" sometimes implies injectivity
whereas in shape analysis $h$ is not required to be injective,
and almost never is injective.)
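
Definition 4 can be checked directly on small examples. The following is a minimal sketch, assuming structures are encoded as (universe, unary, binary) triples of Python dictionaries; the function names are hypothetical, and the search over all maps $h$ is exponential, so it serves only as an executable reading of the definition.

```python
from itertools import product

ZERO, ONE, UNKNOWN = frozenset({0}), frozenset({1}), frozenset({0, 1})

def embeds_via(conc, abst, h):
    """Check the two conditions of Definition 4 for a given total map h."""
    _, c_unary, c_binary = conc
    a_univ, a_unary, a_binary = abst
    if set(h.values()) != set(a_univ):            # h must be surjective
        return False
    for sym, table in c_unary.items():
        if any(v not in a_unary[sym][h[x]] for x, v in table.items()):
            return False
    for sym, table in c_binary.items():
        if any(v not in a_binary[sym][(h[x], h[y])]
               for (x, y), v in table.items()):
            return False
    return True

def in_concretization(conc, abst):
    """Search every map from concrete to abstract individuals for a witness."""
    c_nodes, a_nodes = sorted(conc[0]), sorted(abst[0])
    return any(embeds_via(conc, abst, dict(zip(c_nodes, image)))
               for image in product(a_nodes, repeat=len(c_nodes)))

nodes = ["a", "b", "c"]

def edges(present):
    """Total 0/1 table for a binary relation over `nodes`."""
    return {(x, y): int((x, y) in present) for x in nodes for y in nodes}

abstract = (
    {"u0", "u1"},
    {"x": {"u0": ONE, "u1": ZERO}},
    {"next": {("u0", "u0"): ZERO, ("u0", "u1"): UNKNOWN,
              ("u1", "u0"): ZERO, ("u1", "u1"): UNKNOWN}},
)
x_at_a = {"x": {"a": 1, "b": 0, "c": 0}}
acyclic = (set(nodes), x_at_a, {"next": edges({("a", "b"), ("b", "c")})})
cyclic = (set(nodes), x_at_a, {"next": edges({("a", "b"), ("b", "a")})})
```

In this example the acyclic list is in the concretization of the abstract structure, while the list with an edge back to the node pointed to by x is not, because the abstract structure records that no edge enters u0.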

Bounded Structures Each set of three-valued structures
$\mathcal{S}$ specifies a set of heaps $\gamma^*(\mathcal{S})$. Each such set $\gamma^*(\mathcal{S})$
is definable as the set of models of a formula in
existential monadic second-order logic; the second-order
existential quantification arises from the existential quantification
over the homomorphisms $h$. Constraints that involve
unrestricted second-order existential quantifications have several
undesirable properties [35], [34]. We therefore restrict our
attention to bounded structures, where the homomorphism $h$ is
determined as the natural map associated with the partition
of the elements of $U^\natural$ according to the values of some chosen
finite set of predicates.

For the purpose of this paper, we define bounded
structures as follows. Let $\mathcal{A}_1 \subseteq \mathcal{A}$ be a finite subset of unary
predicates. We call elements of $\mathcal{A}_1$ abstraction predicates.

Definition 5 (Bounded Structure) We say that a
three-valued structure $S = (U, \iota)$ is $\mathcal{A}_1$-bounded iff both of
the following two conditions hold:

1. $\iota(A)(u) \in \{\{0\}, \{1\}\}$ for all $A \in \mathcal{A}_1$ and all $u \in U$;

2. if $u_1, u_2 \in U$ and $u_1 \ne u_2$ then $\iota(A)(u_1) \ne \iota(A)(u_2)$ for
some $A \in \mathcal{A}_1$.
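
The two conditions of Definition 5 translate into a short check. This sketch assumes unary interpretations are given as nested dictionaries and truth values as frozensets; the function name `is_bounded` and its parameters are hypothetical.

```python
ZERO, ONE, UNKNOWN = frozenset({0}), frozenset({1}), frozenset({0, 1})

def is_bounded(universe, unary, abstraction_preds):
    """Check the two conditions of Definition 5.

    universe: iterable of abstract individuals
    unary: symbol -> {individual: ZERO, ONE or UNKNOWN}
    abstraction_preds: list of symbols playing the role of A_1
    """
    universe = list(universe)
    # Condition 1: abstraction predicates have definite values on every node.
    if any(unary[a][u] not in (ZERO, ONE)
           for a in abstraction_preds for u in universe):
        return False
    # Condition 2: distinct nodes differ on some abstraction predicate,
    # i.e. no two nodes share the same tuple of abstraction-predicate values.
    signatures = [tuple(unary[a][u] for a in abstraction_preds)
                  for u in universe]
    return len(set(signatures)) == len(signatures)
```

For example, two nodes that agree on every abstraction predicate violate condition 2, and a node with value {0, 1} on an abstraction predicate violates condition 1.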

Definition 6 (Concretization Definability) The set of
sets of heaps definable via three-valued structures with
concretization is defined by:

$\mathrm{models}[T_1] = \{ \gamma^*(\mathcal{S}) \mid \mathcal{S} \text{ a finite set of } \mathcal{A}_1\text{-bounded three-valued structures} \}$

Note that we use the same notation models[X] when X
denotes a set of structures (Definition 6) and when X denotes
a set of formulas (Definition 3). There is no confusion
because we use distinct names for sets of structures and sets
of formulas.

We proceed to characterize the set $\mathrm{models}[T_1]$ as the set
of models of formulas of a certain form.

We define the notion of a cube first.

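Definition 7 (Exponent Notation) If $A \in \mathcal{A}$ and
$\alpha \in \{0, 1\}$ then $A^\alpha$ is defined by $A^1 = A$ and $A^0 = \lnot A$.

Definition 8 (Cube) A cube over $\mathcal{A}_1$ (or just "cube" for
short) is an expression $P(x)$ of the form

$A_1^{\alpha_1}(x) \land \ldots \land A_q^{\alpha_q}(x)$

where $A_1, \ldots, A_q$ enumerate the elements of $\mathcal{A}_1$ and
$\alpha_1, \ldots, \alpha_q \in \{0, 1\}$.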
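
As an illustration of Definition 8, the cubes over a set of $q$ abstraction predicates can be enumerated as the $2^q$ choices of exponents. The encoding of a cube as a dictionary from predicate names to bits, and the function names below, are assumptions made for this sketch.

```python
from itertools import product

def cubes(abstraction_preds):
    """Enumerate all 2^q cubes, each as {predicate: exponent bit}."""
    return [dict(zip(abstraction_preds, bits))
            for bits in product((0, 1), repeat=len(abstraction_preds))]

def show(cube, var="x"):
    """Render a cube as a conjunction, writing A^0 as a negated atom."""
    return " & ".join(f"{a}({var})" if bit else f"~{a}({var})"
                      for a, bit in cube.items())

def satisfied_cubes(values, abstraction_preds):
    """Cubes satisfied by an element with the given 0/1 predicate values."""
    return [c for c in cubes(abstraction_preds)
            if all(values[a] == bit for a, bit in c.items())]
```

Every element of a two-valued structure satisfies exactly one cube, which is why cubes partition the domain; `satisfied_cubes` always returns a one-element list when `values` assigns 0 or 1 to each abstraction predicate.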

$R_1$-literals are the building blocks for formulas used to
form constraints that characterize $\mathrm{models}[T_1]$.

Definition 9 ($R_1$-literal) Let $P_1(x), P_2(y)$ range over
cubes over $\mathcal{A}_1$, let $A$ range over elements of $\mathcal{A} \setminus \mathcal{A}_1$, and
let $f$ range over $\mathcal{F}$.

An $R_1$-literal is a formula of one of the following forms:

$\exists x.\ P_1(x)$                                                    node present
$\lnot\exists x.\ P_1(x)$                                               node absent
$\lnot\exists x.\ P_1(x) \land A(x)$                                    property does not hold
$\lnot\exists x.\ P_1(x) \land \lnot A(x)$                              property holds
$\lnot\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x, y)$          no edge
$\lnot\exists x \exists y.\ P_1(x) \land P_2(y) \land \lnot f(x, y)$    must edge

We first introduce the class of $R_1$-formulas that satisfy
syntactic invariants that make them isomorphic to
three-valued structures.

Definition 10 ($R_1$-formulas) Let $P(x), P_1(x), P_2(y)$
denote cubes over $\mathcal{A}_1$. A canonical conjunction of $R_1$-literals is
a conjunction of $R_1$-literals that satisfies the following
conditions:

1. for each cube $P(x)$ over $\mathcal{A}_1$, exactly one of the
conjuncts $\exists x.P(x)$ and $\lnot\exists x.P(x)$ occurs in the
conjunction;

2. there is at least one cube $P(x)$ such that the conjunct
$\exists x.P(x)$ occurs in the conjunction;

3. if the conjunct $\lnot\exists x.P(x)$ occurs, then this conjunct is
the only occurrence of the cube $P(x)$ (and the cube
$P(y)$) in the conjunction;

4. for each cube $P(x)$ and $A \in \mathcal{A} \setminus \mathcal{A}_1$, at most one of
the conjuncts $\lnot\exists x.P(x) \land A(x)$ and $\lnot\exists x.P(x) \land \lnot A(x)$
occurs;

5. for every two cubes $P_1(x)$ and $P_2(y)$, at most one of
the conjuncts

$\lnot\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x, y)$

and

$\lnot\exists x \exists y.\ P_1(x) \land P_2(y) \land \lnot f(x, y)$

occurs.

Define an $R_1$-formula as any disjunction of canonical
conjunctions of $R_1$-literals.

In Definition 10 and throughout the paper, the symbol $R_1$
alone denotes $R_1$-formulas, so $\mathrm{models}[R_1]$ is the set of all
models of all $R_1$-formulas (as opposed to, for example, the
set of models of all $R_1$-literals).

The following Proposition 11 shows that three-valued
structures and $R_1$-formulas define the same sets of two-valued
structures. The proof of Proposition 11 is straightforward
because the set of $R_1$-formulas was chosen to facilitate the
proof. The proof shows that there is a semantics-preserving
bijection between three-valued structures and canonical
conjunctions of $R_1$-literals.

Proposition 11 $\mathrm{models}[R_1] = \mathrm{models}[T_1]$

Proof. The idea of the proof is the following. Each
bounded three-valued structure can be represented as a
canonical conjunction of $R_1$-literals, and each canonical
conjunction of $R_1$-literals can be represented as a bounded
three-valued structure. Therefore, disjunctions of canonical
conjunctions of $R_1$-literals correspond to sets of bounded
three-valued structures.

We next give a function $\mu$ mapping each bounded
three-valued structure $S$ to a canonical conjunction of $R_1$-literals
$\mu(S)$. We show that $S$ and $\mu(S)$ represent the same set of
two-valued structures. Moreover, each canonical conjunction of
$R_1$-literals is equal to $\mu(S)$ for some three-valued structure
$S$.

Let $S = (U, \iota)$ be an $\mathcal{A}_1$-bounded three-valued structure.
Define the formula $\mu(S)$ as the conjunction of the following
$R_1$-literals.

Define first, for each $u \in U$, a cube over $\mathcal{A}_1$ corresponding
to $u$, denoted $\pi(u)(x)$, by

$\pi(u)(x) = \bigwedge_{A \in \mathcal{A}_1} A^{\alpha(A)}(x)$

where

$\alpha(A) = \begin{cases} 1, & \text{if } \iota(A)(u) = \{1\} \\ 0, & \text{if } \iota(A)(u) = \{0\} \end{cases}$

$\alpha$ is well-defined because $\iota(A)(u) \in \{\{0\}, \{1\}\}$ for $A \in \mathcal{A}_1$. We
next introduce the $R_1$-literals.



Node existence. For each $u \in U$, introduce an $R_1$-literal

$\exists x.\ \pi(u)(x)$    (1)

For each remaining $\mathcal{A}_1$-cube $P(x)$, that is, for each cube
$P(x)$ such that $\pi(u)(x) \ne P(x)$ for all $u \in U$, introduce an
$R_1$-literal

$\lnot\exists x.\ P(x)$    (2)

Node properties. Let $u \in U$ and $A \in \mathcal{A} \setminus \mathcal{A}_1$. If $\iota(A)(u) = \{1\}$,
introduce the $R_1$-literal

$\lnot\exists x.\ \pi(u)(x) \land \lnot A(x)$    (3)

If $\iota(A)(u) = \{0\}$, introduce the literal

$\lnot\exists x.\ \pi(u)(x) \land A(x)$    (4)

If $\iota(A)(u) = \{0, 1\}$, we introduce no conjuncts.

Edges. Let $u_1, u_2 \in U$ (we allow $u_1 = u_2$) and let $f \in \mathcal{F}$. If
$\iota(f)(u_1, u_2) = \{1\}$, introduce the must-edge $R_1$-literal

$\lnot\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \lnot f(x, y)$    (5)

If $\iota(f)(u_1, u_2) = \{0\}$, introduce the no-edge $R_1$-literal

$\lnot\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x, y)$    (6)

If $\iota(f)(u_1, u_2) = \{0, 1\}$, we introduce no conjuncts.
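
The construction of the literals (1) to (6) can be summarized by a short sketch. It assumes an $\mathcal{A}_1$-bounded structure given as dictionaries, encodes each cube as a tuple of exponent bits in the order of the abstraction predicates, and represents literals as tagged tuples; the function name `mu` and the tags are hypothetical names chosen for this illustration.

```python
from itertools import product

ZERO, ONE, UNKNOWN = frozenset({0}), frozenset({1}), frozenset({0, 1})

def mu(universe, unary, binary, abs_preds):
    """Set of R1-literals (1)-(6) for a bounded three-valued structure."""
    def pi(u):
        # Cube of node u: exponent bit for each abstraction predicate.
        return tuple(1 if unary[a][u] == ONE else 0 for a in abs_preds)

    present = {pi(u) for u in universe}
    lits = {("exists", c) for c in present}                          # (1)
    lits |= {("absent", c)                                           # (2)
             for c in product((0, 1), repeat=len(abs_preds))
             if c not in present}
    for u in universe:
        for a, table in unary.items():
            if a in abs_preds:
                continue
            if table[u] == ONE:
                lits.add(("holds", pi(u), a))                        # (3)
            elif table[u] == ZERO:
                lits.add(("not_holds", pi(u), a))                    # (4)
    for f, table in binary.items():
        for (u1, u2), value in table.items():
            if value == ONE:
                lits.add(("must_edge", pi(u1), pi(u2), f))           # (5)
            elif value == ZERO:
                lits.add(("no_edge", pi(u1), pi(u2), f))             # (6)
    return lits

lits = mu(
    ["u0", "u1"],
    {"x": {"u0": ONE, "u1": ZERO}, "r": {"u0": ONE, "u1": UNKNOWN}},
    {"next": {("u0", "u0"): ZERO, ("u0", "u1"): UNKNOWN,
              ("u1", "u0"): ZERO, ("u1", "u1"): UNKNOWN}},
    ["x"],
)
```

Values {0, 1} contribute no literal, so the less definite a structure is, the fewer conjuncts its formula has.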

Define formula $\mu(S)$ as the conjunction of all formulas
(1), (2), (3), (4), (5), (6), introduced as described above.
We next show $\gamma_F(\mu(S)) = \gamma(S)$. In both directions, we
establish the following property of the homomorphism $h$ from
$S^\natural$ to $S$:

$h(u^\natural) = u$ iff $S^\natural \models \pi(u)(u^\natural)$    (7)
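Direction $\gamma_F(\mu(S)) \supseteq \gamma(S)$. Let $S^\natural \in \gamma(S)$. Then $S^\natural \sqsubseteq^h S$
for some homomorphism $h$. We establish that (7) holds
for $h$. Let $u = h(u^\natural)$. For $A \in \mathcal{A}_1$, we have
$\{\iota^\natural(A)(u^\natural)\} = \iota(A)(u)$, so $\models A^{\alpha(A)}(u^\natural)$.
Therefore, $\models \pi(u)(u^\natural)$. Conversely, each $u^\natural$ satisfies exactly
one cube, and distinct elements of $U$ have distinct cubes because
$S$ is bounded, which establishes (7).
We next show $S^\natural \models C$ for each conjunct $C$ of $\mu(S)$.

1) Consider $C = \exists x.\pi(u)(x)$ for some $u$. Because $h$ is a
surjection, $h(u^\natural) = u$ for some $u^\natural$, so $\models \pi(u)(u^\natural)$, and $\models C$.

2) Consider $C = \lnot\exists x.P(x)$ for the cube $P(x)$ distinct
from all cubes $\pi(u)(x)$. Consider any $u^\natural \in U^\natural$. Then
$\models \pi(h(u^\natural))(u^\natural)$, and $P(x)$ and $\pi(h(u^\natural))(x)$ are distinct cubes,
so $\not\models P(u^\natural)$. Therefore, $\models C$.

3) Consider $C = \lnot\exists x.\pi(u)(x) \land \lnot A(x)$ for some
$A \in \mathcal{A} \setminus \mathcal{A}_1$. Then $\iota(A)(u) = \{1\}$. Consider any $u^\natural$. If
$\not\models \pi(u)(u^\natural)$, then clearly $\models C$. If $\models \pi(u)(u^\natural)$, then
$h(u^\natural) = u$ by (7), and because $h$ is a homomorphism,
$\iota^\natural(A)(u^\natural) = 1$, so $\not\models \lnot A(u^\natural)$, so again $\models C$.

4) Consider $C = \lnot\exists x.\pi(u)(x) \land A(x)$ for some $A \in \mathcal{A} \setminus \mathcal{A}_1$.
Analogously to the previous case, $\iota(A)(u) = \{0\}$. Consider
any $u^\natural$. If $\models \pi(u)(u^\natural)$, then $h(u^\natural) = u$, and because $h$ is a
homomorphism, $\iota^\natural(A)(u^\natural) = 0$, so $\not\models A(u^\natural)$ and thus $\models C$.

5) Consider $C = \lnot\exists x \exists y.\pi(u_1)(x) \land \pi(u_2)(y) \land \lnot f(x, y)$.
Then $\iota(f)(u_1, u_2) = \{1\}$. Consider any $u_1^\natural, u_2^\natural \in U^\natural$. If
$\not\models \pi(u_1)(u_1^\natural)$ or $\not\models \pi(u_2)(u_2^\natural)$, then $\models C$.
Suppose $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$. Then $h(u_1^\natural) = u_1$
and $h(u_2^\natural) = u_2$ by (7), and $h$ is a homomorphism so
$\iota^\natural(f)(u_1^\natural, u_2^\natural) = 1$. Then $\not\models \lnot f(u_1^\natural, u_2^\natural)$ so $\models C$.

6) Consider $C = \lnot\exists x \exists y.\pi(u_1)(x) \land \pi(u_2)(y) \land f(x, y)$.
Analogously to the previous case, $\iota(f)(u_1, u_2) = \{0\}$; for any
$u_1^\natural, u_2^\natural \in U^\natural$, if $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$ then
$h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$, so $\iota^\natural(f)(u_1^\natural, u_2^\natural) = 0$,
$\not\models f(u_1^\natural, u_2^\natural)$, so $\models C$.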

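Direction $\gamma_F(\mu(S)) \subseteq \gamma(S)$. Let $S^\natural \in \gamma_F(\mu(S))$; then all
conjuncts of $\mu(S)$ are true in $S^\natural$. We show that $S^\natural \sqsubseteq^h S$
where $h$ is defined in the following way. Consider any
$u^\natural \in U^\natural$. There is exactly one cube $C(x)$ such that $\models C(u^\natural)$.
Moreover, because $\mu(S)$ contains $\lnot\exists x.P(x)$ for cubes $P(x)$
other than $\pi(u)(x)$, the cube $C(x)$ is of the form $\pi(u)(x)$ for
some $u \in U$. Define $h(u^\natural) = u$. This defines the function
$h$. By construction, (7) holds. Furthermore, $h$ is surjective:
for each $u \in U$, the conjunct $\exists x.\pi(u)(x)$ is in $\mu(S)$, so there
exists $u^\natural$ such that $\models \pi(u)(u^\natural)$ and thus $h(u^\natural) = u$. We next
show that $h$ is a homomorphism.

1) Let us show

$\{ \iota^\natural(A)(u^\natural) \mid h(u^\natural) = u \} \subseteq \iota(A)(u)$

for all $A \in \mathcal{A}$ and for all $u \in U$. Consider $A \in \mathcal{A}_1$ and $u^\natural$ such
that $h(u^\natural) = u$. Then $\models \pi(u)(u^\natural)$, so $\models A^{\alpha(A)}(u^\natural)$, which
implies $\iota^\natural(A)(u^\natural) \in \iota(A)(u)$. Next, consider $A \in \mathcal{A} \setminus \mathcal{A}_1$.
If $\iota(A)(u) = \{0, 1\}$ the property trivially holds. Consider
$\iota(A)(u) = \{1\}$. Then $\lnot\exists x.\pi(u)(x) \land \lnot A(x)$ occurs in $\mu(S)$.
Therefore, if $h(u^\natural) = u$, then $\models A(u^\natural)$, otherwise the
conjunct would be false. Therefore, $\iota^\natural(A)(u^\natural) = 1 \in \iota(A)(u)$. The
case $\iota(A)(u) = \{0\}$ is analogous: $\lnot\exists x.\pi(u)(x) \land A(x)$ occurs
in $\mu(S)$, so if $h(u^\natural) = u$ then $\not\models A(u^\natural)$, and
$\iota^\natural(A)(u^\natural) = 0 \in \iota(A)(u)$.

2) Let us show

$\{ \iota^\natural(f)(u_1^\natural, u_2^\natural) \mid h(u_1^\natural) = u_1 \land h(u_2^\natural) = u_2 \} \subseteq \iota(f)(u_1, u_2)$    (8)

for all $f \in \mathcal{F}$ and $u_1, u_2 \in U$. This is similar to 1). If
$\iota(f)(u_1, u_2) = \{0, 1\}$, the inclusion trivially holds. Consider the
case $\iota(f)(u_1, u_2) = \{1\}$. Then
$\lnot\exists x \exists y.\pi(u_1)(x) \land \pi(u_2)(y) \land \lnot f(x, y)$ occurs in $\mu(S)$.
Suppose that $h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$.
Then $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$, so
$\models f(u_1^\natural, u_2^\natural)$ as well, otherwise the conjunct would be false.
Therefore, $\iota^\natural(f)(u_1^\natural, u_2^\natural) = 1$ and the inclusion (8) holds. The
case $\iota(f)(u_1, u_2) = \{0\}$ is analogous:
$\lnot\exists x \exists y.\pi(u_1)(x) \land \pi(u_2)(y) \land f(x, y)$ occurs in $\mu(S)$, so if
$h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$, then $\models \pi(u_1)(u_1^\natural)$ and
$\models \pi(u_2)(u_2^\natural)$, so $\not\models f(u_1^\natural, u_2^\natural)$ and
$\iota^\natural(f)(u_1^\natural, u_2^\natural) = 0$. The inclusion (8)
holds, and $S^\natural \sqsubseteq^h S$.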

Because every structure S has a corresponding equiva- 
lent formula fJ.(S), we conclude modelspl] C models[_Ri]. To 
conclude models[Ti] D models[i?i], we show that /x is surjec- 
tive: every canonical conjunction F of i?i-formulas is equal 
to fJ,(S) for some structure S. 

Let $F$ be a canonical conjunction of $R_1$-literals. For each cube $P(x)$ such that $\exists x.P(x)$ occurs in $F$, let $u_{P(x)}$ be a distinct element. Let $U$ be the set of all such elements $u_{P(x)}$. Property 2 of Definition 10 ensures that $U$ is non-empty. Let $\alpha$ be such that $P(x) = \bigwedge_{A \in \mathcal{A}_1} A(x)^{\alpha(A)}$. Then define $\iota(A)(u_{P(x)}) = \{\alpha(A)\}$ for all $A \in \mathcal{A}_1$. For $A \in \mathcal{A} \setminus \mathcal{A}_1$, define $\iota(A)(u_{P(x)})$ as $\{1\}$ if $\neg\exists x.\ P(x) \land \neg A(x)$ occurs in $F$, as $\{0\}$ if $\neg\exists x.\ P(x) \land A(x)$ occurs in $F$, and as $\{0,1\}$ otherwise. Such a definition of $\iota(A)(u_{P(x)})$ is possible because of Property 4 of Definition 10. Analogously, using Property 5 of Definition 10, for each $f \in \mathcal{F}$ define $\iota(f)(u_{P_1(x)},u_{P_2(x)})$ as $\{1\}$ if $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$ occurs in $F$, as $\{0\}$ if $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$ occurs in $F$, and as $\{0,1\}$ otherwise. Let $S = \langle U, \iota \rangle$. To show $F = \mu(S)$, recall first that we use an abstract view of the syntax that takes into account associativity, commutativity and idempotence of conjunction. It therefore suffices to show that $F$ and $\mu(S)$ contain the same set of conjuncts. It is easy to see that each conjunct of $\mu(S)$ occurs in $F$. The converse is also straightforward by Definition 10.

We conclude that $\mu$ is surjective, and models[$R_1$] $\subseteq$ models[$T_1$], which completes the proof.

Although this fact is not needed for the proof, we remark that $\mu$ is also injective, so $\mu$ is, in fact, a bijection between the set 3-STRUCT and the set of canonical conjunctions of $R_1$-literals. ■

We proceed to show that a syntactically richer class of formulas defines the same set of constraints as $R_1$-formulas.

Definition 12 ($R_2$-formulas) An $R_2$-formula is a disjunction of (not necessarily canonical) conjunctions of $R_1$-literals.

The proof of the following Lemma 13 provides a normalization algorithm that converts every conjunction of $R_1$-literals into an equivalent disjunction of canonical conjunctions of $R_1$-literals.

Lemma 13 Each conjunction of $R_1$-literals can be written as an equivalent $R_1$-formula.
Proof. Consider an arbitrary, not necessarily canonical, conjunction $F$ of $R_1$-literals. We show how to transform $F$ into an equivalent disjunction of canonical conjunctions of $R_1$-literals. The idea is to transform each conjunction into a disjunction of multiple conjunctions to ensure that all properties in Definition 10 are satisfied. We perform the following transformations as long as some property of Definition 10 is violated.

Property 1. If both $\exists x.P(x)$ and $\neg\exists x.P(x)$ occur, use the rule $Q \land \neg Q \to \mathit{false}$ and eliminate the entire conjunction from the disjunction of conjunctions. If neither $\exists x.P(x)$ nor $\neg\exists x.P(x)$ occurs, use the rule $\mathit{true} \to Q \lor \neg Q$ to introduce the missing literal for $P(x)$, and then distribute the disjunction to the top level of the formula.

Property 2. First ensure that Property 1 holds. If the resulting conjunction contains no conjuncts of the form $\exists x.P(x)$, then the conjunction contains a conjunct $\neg\exists x.P(x)$ for every cube $P(x)$ over $\mathcal{A}_1$. Therefore, the entire conjunction is false and can be eliminated from the disjunction of conjunctions.

Property 3. First ensure that Property 1 holds. Then, if the literal $\neg\exists x.P(x)$ occurs in the conjunction, remove from the conjunction all other literals containing $P(x)$. Such literals are of the form $\neg\exists x.\ P(x) \land F_1(x)$ for some $F_1(x)$, $\neg\exists x \exists y.\ P(x) \land F_2(x,y)$ for some $F_2(x,y)$, or $\neg\exists x \exists y.\ P(y) \land F_2(x,y)$ for some $F_2(x,y)$; all these literals are implied by $\neg\exists x.P(x)$, so removing them yields an equivalent formula.

Property 4. If both conjuncts $\neg\exists x.\ P(x) \land A(x)$ and $\neg\exists x.\ P(x) \land \neg A(x)$ occur, replace them with the equivalent conjunct $\neg\exists x.P(x)$.

Property 5. If both

$\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$

and

$\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$

occur, replace them with

$(\neg\exists x.\ P_1(x)) \lor (\neg\exists y.\ P_2(y))$

then propagate the disjunction to the top level of the formula. ■
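The Property 1 and Property 2 steps of this proof can be sketched in Python. This is only an illustrative sketch under assumptions of ours: cubes are represented by arbitrary names, a conjunction's existence literals by a dictionary from cube name to a boolean, and the function name `complete_existence` is hypothetical. A conjunction containing both literals for one cube is not representable here and is assumed to have been eliminated already.

```python
from itertools import product

def complete_existence(cubes, partial):
    """Enumerate the total maps cube -> bool that extend `partial`.

    True  means the literal 'exists x. P(x)' occurs,
    False means the literal 'not exists x. P(x)' occurs.

    Property 1: every cube must carry exactly one of the two literals,
    so each unmentioned cube is split into both cases.
    Property 2: a completion with no positive existence literal is
    treated as false (as in the proof) and is dropped.
    """
    missing = [c for c in cubes if c not in partial]
    result = []
    for choice in product([True, False], repeat=len(missing)):
        total = dict(partial)
        total.update(zip(missing, choice))
        if any(total.values()):
            result.append(total)
    return result
```

Each returned dictionary corresponds to one disjunct after the disjunction has been distributed to the top level.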

Lemma 13 implies that $R_2$-formulas, although a syntactically richer class, are no more expressive than $R_1$-formulas, hence Corollary 14.

Corollary 14 models[$R_2$] = models[$R_1$]

Proof. models[$R_1$] $\subseteq$ models[$R_2$] because $R_2$ is a richer class of formulas. Conversely, let $\mathcal{S}^\natural \in$ models[$R_2$]. Then $\mathcal{S}^\natural = \gamma_F(F)$ for some $R_2$-formula $F$. By Lemma 13, let $F'$ be an $R_1$-formula obtained by transforming conjunctions of $F$ into disjunctions of canonical conjunctions of $R_1$-literals. $F'$ is an $R_1$-formula equivalent to $F$. Therefore, $\mathcal{S}^\natural = \gamma_F(F')$, and $\mathcal{S}^\natural \in$ models[$R_1$]. ■

Definition 15 (Positive Boolean Combination) If $B(p_1,\ldots,p_n)$ is a formula built from $p_1,\ldots,p_n$ using $\land$, $\lor$, $\neg$, we say that $p_i$ (for $1 \le i \le n$) occurs positively in $B(p_1,\ldots,p_n)$ iff $p_i$ occurs under an even number of $\neg$ signs. We say that $B(p_1,\ldots,p_n)$ is a positive boolean combination iff each of $p_1,\ldots,p_n$ occurs positively in $B(p_1,\ldots,p_n)$.

Definition 16 ($R_3$-formulas) An $R_3$-formula is a positive boolean combination of $R_1$-literals.

Lemma 17 states that $R_2$-formulas are simply the disjunctive normal forms of $R_3$-formulas.

Lemma 17 Every $R_3$-formula is equivalent to an $R_2$-formula.

Proof. Let $F$ be an $R_3$-formula. Then the disjunctive normal form of $F$ is an $R_2$-formula. ■

Corollary 18 models[$R_3$] = models[$R_2$]

Proof. By Lemma 17. ■
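The disjunctive normal form step used in Lemma 17 can be illustrated with a short Python sketch. The representation is an assumption made for illustration only: literals are opaque strings, formulas are nested tuples `("and", l, r)` or `("or", l, r)`, and the combination is assumed to be already negation-free (all literals occur positively with no remaining negation signs above them).

```python
def dnf(formula):
    """Return the DNF of a negation-free combination as a set of frozensets.

    Each frozenset stands for one conjunction of literal names; the
    outer set stands for the disjunction of these conjunctions.
    """
    if isinstance(formula, str):
        return {frozenset([formula])}
    op, left, right = formula
    l, r = dnf(left), dnf(right)
    if op == "or":
        # A disjunction contributes the disjuncts of both sides.
        return l | r
    if op == "and":
        # Distribute conjunction over the disjuncts of both sides.
        return {a | b for a in l for b in r}
    raise ValueError(f"unknown connective: {op}")
```

Using sets also reflects the abstract view of syntax modulo associativity, commutativity and idempotence.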

In the sequel we observe that replacing cubes over $\mathcal{A}_1$ in the definition of $R_1$-literals with boolean combinations over $\mathcal{A}_1$ does not change the set of expressible sets of two-valued structures. Definition 19 generalizes Definition 9.

Definition 19 ($R_4$-literals) Let $B_1(x), B_2(y)$ range over arbitrary boolean combinations of elements of $\mathcal{A}_1$, let $Q(x)$ range over disjunctions of literals of the form $A(x)$ and the form $\neg A(x)$ for $A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $g(x,y)$ range over disjunctions of literals of the form $f(x,y)$ and $\neg f(x,y)$ where $f \in \mathcal{F}$.

An $R_4$-literal is a formula of one of the following forms:

1. $\exists x.\ B_1(x)$

2. $\neg\exists x.\ B_1(x) \land Q(x)$

3. $\neg\exists x \exists y.\ B_1(x) \land B_2(y) \land g(x,y)$

Definition 20 An $R_4$-formula is a positive boolean combination of $R_4$-literals.

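Lemma 21 models[$R_4$] = models[$R_3$]

$\exists x.\ B_1(x) \lor B_2(x) \;\to\; (\exists x.\ B_1(x)) \lor (\exists x.\ B_2(x))$

$\neg\exists x.\ B_1(x) \lor B_2(x) \;\to\; (\neg\exists x.\ B_1(x)) \land (\neg\exists x.\ B_2(x))$

$\neg\exists x.\ (B_1(x) \lor B_2(x)) \land Q(x) \;\to\; (\neg\exists x.\ B_1(x) \land Q(x)) \land (\neg\exists x.\ B_2(x) \land Q(x))$

$\neg\exists x.\ B_1(x) \land (Q_1(x) \lor Q_2(x)) \;\to\; (\neg\exists x.\ B_1(x) \land Q_1(x)) \land (\neg\exists x.\ B_1(x) \land Q_2(x))$

$\neg\exists x \exists y.\ (B_{11}(x) \lor B_{12}(x)) \land B_2(y) \land g(x,y) \;\to\; (\neg\exists x \exists y.\ B_{11}(x) \land B_2(y) \land g(x,y)) \land (\neg\exists x \exists y.\ B_{12}(x) \land B_2(y) \land g(x,y))$

$\neg\exists x \exists y.\ B_1(x) \land (B_{21}(y) \lor B_{22}(y)) \land g(x,y) \;\to\; (\neg\exists x \exists y.\ B_1(x) \land B_{21}(y) \land g(x,y)) \land (\neg\exists x \exists y.\ B_1(x) \land B_{22}(y) \land g(x,y))$

$\neg\exists x \exists y.\ B_1(x) \land B_2(y) \land (g_1(x,y) \lor g_2(x,y)) \;\to\; (\neg\exists x \exists y.\ B_1(x) \land B_2(y) \land g_1(x,y)) \land (\neg\exists x \exists y.\ B_1(x) \land B_2(y) \land g_2(x,y))$

Figure 1: Transforming $R_4$-literals into $R_1$-literals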



Proof. Note that a formula of the form $\neg\exists x.\ B_1(x)$ is equivalent to the formula $\neg\exists x.\ B_1(x) \land (A(x) \lor \neg A(x))$, which is of the form $\neg\exists x.\ B_1(x) \land Q(x)$. Therefore, $R_4$ is a richer class than $R_3$, so models[$R_4$] $\supseteq$ models[$R_3$]. To show the converse, we transform each $R_4$-literal into a positive boolean combination of $R_1$-literals.

First, transform each boolean combination $B(x)$ (and $B(y)$) of $\mathcal{A}_1$ predicates into canonical disjunctive normal form, so that each $B(x)$ is a disjunction of cubes. Then apply the rules in Figure 1 to decompose $R_4$-literals into $R_1$-literals. ■

By eliminating the top-level negation from $R_4$-literals we obtain $R_5$-literals, which use universal quantifiers.
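The first step of this proof, rewriting a boolean combination over the abstraction predicates as a disjunction of cubes, can be sketched as follows. The encoding is an assumption for illustration: a boolean combination is given as a Python function over a truth assignment, a cube is a total assignment of truth values to the predicates, and `cubes_of` is a hypothetical helper name.

```python
from itertools import product

def cubes_of(predicates, b):
    """Return the cubes (total truth assignments over `predicates`)
    under which the boolean combination `b` evaluates to True.

    The disjunction of the returned cubes is the canonical
    disjunctive normal form of `b`.
    """
    cubes = []
    for values in product([False, True], repeat=len(predicates)):
        assignment = dict(zip(predicates, values))
        if b(assignment):
            cubes.append(assignment)
    return cubes
```

For example, `cubes_of(["A", "B"], lambda v: v["A"] or not v["B"])` yields the three cubes in which `A` holds or `B` fails. The enumeration is exponential in the number of abstraction predicates.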

Definition 22 ($R_5$-literal) Let $B_1(x), B_2(y)$ be variables denoting arbitrary boolean combinations of elements of $\mathcal{A}_1$, let $Q_P(x)$ denote conjunctions of literals of the form $A(x)$ and of the form $\neg A(x)$ for $A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $g_P(x,y)$ denote conjunctions of literals of the form $f(x,y)$ and of the form $\neg f(x,y)$ for $f \in \mathcal{F}$.

An $R_5$-literal is a formula of one of the following forms:

1. $\exists x.\ B_1(x)$

2. $\forall x.\ B_1(x) \Rightarrow Q_P(x)$

3. $\forall x \forall y.\ B_1(x) \land B_2(y) \Rightarrow g_P(x,y)$

Definition 23 An $R_5$-formula is a positive boolean combination of $R_5$-literals.

Lemma 24 models[$R_5$] = models[$R_4$]

Proof. $\forall x.\ B_1(x) \Rightarrow Q_P(x)$ corresponds to $\neg\exists x.\ B_1(x) \land Q(x)$ with $Q_P = \neg Q$, whereas $\forall x \forall y.\ B_1(x) \land B_2(y) \Rightarrow g_P(x,y)$ corresponds to $\neg\exists x \exists y.\ B_1(x) \land B_2(y) \land g(x,y)$ with $g_P = \neg g$. ■
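The correspondence used in this proof is ordinary quantifier duality. Spelled out for the unary case (the binary case is identical with two quantifiers):

```latex
\begin{align*}
\forall x.\ B_1(x) \Rightarrow Q_P(x)
  &\equiv \neg\exists x.\ \neg\bigl(B_1(x) \Rightarrow Q_P(x)\bigr) \\
  &\equiv \neg\exists x.\ B_1(x) \land \neg Q_P(x) \\
  &\equiv \neg\exists x.\ B_1(x) \land Q(x)
  \qquad \text{where } Q = \neg Q_P .
\end{align*}
```

Since $Q_P$ is a conjunction of literals, De Morgan's law makes $Q = \neg Q_P$ a disjunction of literals, as the definition of the corresponding literal form requires.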



In the end we introduce $R_6$-formulas. Like heap abstractions based on may-edges, $R_6$-formulas implicitly indicate the absence of edges by specifying the set of possible endpoints for each edge.

Definition 25 ($R_6$-literals) Let $B_1(x), B_2(y)$ denote arbitrary boolean combinations of elements of $\mathcal{A}_1$, let $Q_P(x)$ denote conjunctions of literals $A(x)$ and $\neg A(x)$ for $A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $f$ denote elements of $\mathcal{F}$.

An $R_6$-literal is a formula of one of the following forms:

1. $\exists x.\ B_1(x)$ (node existence)

2. $\forall x.\ B_1(x) \Rightarrow Q_P(x)$ (node properties)

3. $\forall x \forall y.\ B_1(x) \land f(x,y) \Rightarrow B_2(y)$ (may-edges)

4. $\forall x \forall y.\ B_1(x) \land B_2(y) \Rightarrow f(x,y)$ (must-edges)

Definition 26 An $R_6$-formula is a positive boolean combination of $R_6$-literals.

Lemma 27 models[$R_6$] = models[$R_5$]

Proof. Observe first that the may-edge literal

$\forall x \forall y.\ B_1(x) \land f(x,y) \Rightarrow B_2(y)$

is equivalent to

$\forall x \forall y.\ B_1(x) \land \neg B_2(y) \Rightarrow \neg f(x,y)$   (9)

which is an $R_5$-literal. Conversely, every $R_5$-literal can be shown to be equivalent to a conjunction of may-edge and must-edge $R_6$-literals using the transformation

$\forall x \forall y.\ B_1(x) \land \neg B_2(y) \Rightarrow g_1(x,y) \land g_2(x,y) \;\to\; (\forall x \forall y.\ B_1(x) \land \neg B_2(y) \Rightarrow g_1(x,y)) \land (\forall x \forall y.\ B_1(x) \land \neg B_2(y) \Rightarrow g_2(x,y))$   (10)

■
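The equivalence between the may-edge literal and formula (9) is a propositional rearrangement of the body under the two universal quantifiers:

```latex
\begin{align*}
B_1(x) \land f(x,y) \Rightarrow B_2(y)
  &\equiv \neg B_1(x) \lor \neg f(x,y) \lor B_2(y) \\
  &\equiv \neg\bigl(B_1(x) \land \neg B_2(y)\bigr) \lor \neg f(x,y) \\
  &\equiv B_1(x) \land \neg B_2(y) \Rightarrow \neg f(x,y) .
\end{align*}
```

Because $B_2$ ranges over arbitrary boolean combinations of abstraction predicates, $\neg B_2$ is again an admissible choice for the second node formula.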

The following Corollary 28 summarizes the results on different representations of constraints corresponding to three-valued structures.

Corollary 28

models[$T_1$] = models[$R_1$] = models[$R_2$] = models[$R_3$] = models[$R_4$] = models[$R_5$] = models[$R_6$]

3.1 Closure under Disjunction and Conjunction 

By definition, the syntactic class of $R_3$-formulas is closed under disjunction and conjunction. As Corollary 29 below observes, this provides a way to compute the (disjunction and) conjunction of three-valued structures.

Corollary 29 The family of sets models[$T_1$] is closed under union and intersection.

Proof. The closure under union is trivial because union of sets of three-valued structures corresponds to the union of their models. For the closure under intersection, consider two sets of three-valued structures $\mathcal{S}_1$ and $\mathcal{S}_2$. Let $F_1$ be an $R_3$-formula such that $\gamma_F(F_1) = \gamma^*(\mathcal{S}_1)$ and $F_2$ an $R_3$-formula such that $\gamma_F(F_2) = \gamma^*(\mathcal{S}_2)$. Then $F_1 \land F_2$ is also an $R_3$-formula, and the set of three-valued structures corresponding to $F_1 \land F_2$ denotes the desired intersection. ■



4 Three-Valued Structures with Tight Concretization

This section examines the constraints that arise from the meaning of sets of three-valued structures under tight concretization. These constraints are slightly more expressive than the constraints in Section 3, as Section 4.2 shows. Interestingly, the added expressive power is just enough to make the constraints in this section closed under all boolean operations (Section 4.1). These closure properties are in contrast to the properties of the constraints in Section 3, which are closed only under union and intersection. The closure under boolean operations allows, for example, reducing the implication of constraints to the satisfiability of constraints.

The structure of this section mirrors the structure of Section 3. We start by defining the interpretation of three-valued structures under tight concretization.

The following definition corresponds to 48, Definition 
3.6], EH Chapter 7]. Compared to our Definition B] of Sec- 
tion [3] the only difference is the use of "=" instead of "D" 
in condition 1 on $\iota(A)$ and in condition 2 on $\iota(f)$.

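Definition 30 (Tight Concretization) Let $S^\natural = \langle U^\natural, \iota^\natural \rangle$ be a two-valued structure, let $S = \langle U, \iota \rangle$ be a three-valued structure, and let $h : U^\natural \to U$ be a surjective total function. We write $S^\natural \sqsubseteq_T^h S$ iff

1. for every $A \in \mathcal{A}$ and $u \in U$:

$\iota(A)(u) = \{\, \iota^\natural(A)(u^\natural) \mid h(u^\natural) = u \,\}$

2. for every $f \in \mathcal{F}$ and $u_1, u_2 \in U$:

$\iota(f)(u_1,u_2) = \{\, \iota^\natural(f)(u_1^\natural,u_2^\natural) \mid h(u_1^\natural) = u_1 \land h(u_2^\natural) = u_2 \,\}$

We write $S^\natural \sqsubseteq_T S$ iff there exists a surjective total function $h$ such that $S^\natural \sqsubseteq_T^h S$, and in that case we call $h$ a homomorphism. The tight concretization of a three-valued structure $S$ is given by:

$\gamma_T(S) = \{\, S^\natural \mid S^\natural \sqsubseteq_T S \,\}$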
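To make the difference between the two notions concrete, here is a minimal Python sketch that checks condition 1 (unary predicates only) for a given map `h`, either with equality (tight) or with inclusion (non-tight). The dictionary encoding and the name `embeds_unary` are assumptions for illustration; binary predicates would be checked in the same way over pairs of elements.

```python
def embeds_unary(two, three, h, tight):
    """Check the unary-predicate condition of (tight) concretization.

    two:   dict predicate -> dict element -> 0 or 1   (two-valued structure)
    three: dict predicate -> dict node -> set of values drawn from {0, 1}
    h:     dict element -> node, assumed to be total and surjective
    tight: if True require equality, otherwise only inclusion
    """
    for pred, interp in three.items():
        for node, allowed in interp.items():
            # Values actually observed on the elements mapped to this node.
            observed = {two[pred][e] for e in h if h[e] == node}
            ok = (observed == allowed) if tight else observed.issubset(allowed)
            if not ok:
                return False
    return True
```

For instance, if a summary node allows both values `{0, 1}` for `A` but every concrete element mapped to it satisfies `A`, the inclusion check succeeds while the tight check fails.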

We extend $\gamma_T$ to $\gamma_T^*$ that acts on sets of three-valued structures so that the set denotes a disjunction:

$\gamma_T^*(\mathcal{S}) = \bigcup_{S \in \mathcal{S}} \gamma_T(S)$

Definition 31 (Tight Concretization Definability) The set of sets of two-valued structures definable via three-valued structures with tight concretization is defined by:

models[$T_2$] = $\{\, \gamma_T^*(\mathcal{S}) \mid \mathcal{S}$ a finite set of $\mathcal{A}_1$-bounded three-valued structures $\}$

$TR_1$-literals are used to build formulas that characterize models[$T_2$].

Definition 32 ($TR_1$-literal) Let $P_1(x), P_2(y)$ range over cubes over $\mathcal{A}_1$, let $A$ range over elements of $\mathcal{A} \setminus \mathcal{A}_1$, and let $f$ range over $\mathcal{F}$. A $TR_1$-atomic-formula is a formula of one of the following forms:

$\exists x.\ P_1(x)$

$\exists x.\ P_1(x) \land A(x)$

$\exists x.\ P_1(x) \land \neg A(x)$

$\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$

$\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$

A $TR_1$-literal is a $TR_1$-atomic-formula or its negation.

$TR_1$-formulas satisfy syntactic invariants that make them isomorphic to three-valued structures under tight concretization.

Definition 33 ($TR_1$-formulas) Let $P(x), P_1(x), P_2(y)$ denote cubes over $\mathcal{A}_1$. A canonical conjunction of $TR_1$-literals is a conjunction of $TR_1$-literals that satisfies the following conditions.

1. for each cube $P(x)$ over $\mathcal{A}_1$, exactly one of the conjuncts $\exists x.P(x)$ and $\neg\exists x.P(x)$ occurs;

2. there is at least one cube $P(x)$ such that the conjunct $\exists x.P(x)$ occurs in the conjunction;

3. if the conjunct $\neg\exists x.P(x)$ occurs, then this conjunct is the only occurrence of the cube $P(x)$ (and the cube $P(y)$) in the conjunction;

4. for each cube $P(x)$ and $A \in \mathcal{A} \setminus \mathcal{A}_1$, exactly one of the following three conditions holds:

(a) $\neg\exists x.\ P(x) \land A(x)$ occurs in the conjunction,

(b) $\neg\exists x.\ P(x) \land \neg A(x)$ occurs in the conjunction,

(c) both $\exists x.\ P(x) \land A(x)$ and $\exists x.\ P(x) \land \neg A(x)$ occur in the conjunction;

5. for every two cubes $P_1(x)$ and $P_2(y)$ and every $f \in \mathcal{F}$, exactly one of the following three conditions holds:

(a) $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$ occurs in the conjunction;

(b) $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$ occurs in the conjunction;

(c) both $\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$ and $\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$ occur in the conjunction.

A $TR_1$-formula is a disjunction of canonical conjunctions of $TR_1$-literals.

The following Proposition 34 shows that $TR_1$-formulas capture precisely the meaning of three-valued structures under tight concretization. The proof of Proposition 34 is similar to the proof of Proposition 11 and is similarly straightforward.

Proposition 34 models[$TR_1$] = models[$T_2$]

Proof. The idea of this proof is similar to the idea of the proof of Proposition 11: the meaning of each bounded three-valued structure under the tight concretization is equal to the meaning of some canonical conjunction of $TR_1$-literals, and conversely. Therefore, disjunctions of canonical conjunctions of $TR_1$-literals correspond to sets of bounded three-valued structures under the tight concretization.

We next give a function $\mu$ mapping each bounded three-valued structure $S$ to a canonical conjunction of $TR_1$-literals $\mu(S)$. We show that $S$ under tight concretization and $\mu(S)$ represent the same set of two-valued structures. Moreover, $\mu$ is surjective.

Let $S = \langle U, \iota \rangle$ be an $\mathcal{A}_1$-bounded three-valued structure. Define the formula $\mu(S)$ as the conjunction of the following $TR_1$-literals. Define $\pi$ as in the proof of Proposition 11.
Node existence. For each $u \in U$, introduce the $TR_1$-literal

$\exists x.\ \pi(u)(x)$   (11)

For each remaining $\mathcal{A}_1$-cube $P(x)$, that is, for each cube $P(x)$ such that $\pi(u)(x) \neq P(x)$ for all $u \in U$, introduce the $TR_1$-literal

$\neg\exists x.\ P(x)$   (12)

Node properties. Let $u \in U$ and $A \in \mathcal{A} \setminus \mathcal{A}_1$. If $\iota(A)(u) = \{1\}$, introduce the $TR_1$-literal

$\neg\exists x.\ \pi(u)(x) \land \neg A(x)$   (13)

If $\iota(A)(u) = \{0\}$, introduce the literal

$\neg\exists x.\ \pi(u)(x) \land A(x)$   (14)

If $\iota(A)(u) = \{0,1\}$, introduce the following two $TR_1$-literals

$\exists x.\ \pi(u)(x) \land A(x)$
$\exists x.\ \pi(u)(x) \land \neg A(x)$   (15)

Edges. Let $u_1, u_2 \in U$ (we allow $u_1 = u_2$) and let $f \in \mathcal{F}$. If $\iota(f)(u_1,u_2) = \{1\}$, introduce the $TR_1$-literal

$\neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \neg f(x,y)$   (16)

If $\iota(f)(u_1,u_2) = \{0\}$, introduce the $TR_1$-literal

$\neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x,y)$   (17)

If $\iota(f)(u_1,u_2) = \{0,1\}$, introduce the following two $TR_1$-literals

$\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x,y)$
$\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \neg f(x,y)$   (18)

Define the formula $\mu(S)$ as the conjunction of all formulas (11), (12), (13), (14), (15), (16), (17), (18) introduced as described above. We next show $\gamma_F(\mu(S)) = \gamma_T(S)$. As in the proof of Proposition 11, we establish and then use the property relating $h$ and $\pi$ from that proof, for a homomorphism $h$ from $S^\natural$ to $S$.
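Direction $\gamma_F(\mu(S)) \supseteq \gamma_T(S)$. Let $S^\natural \in \gamma_T(S)$. Then $S^\natural \sqsubseteq_T^h S$ for some homomorphism $h$. We establish that the property relating $h$ and $\pi$ from the proof of Proposition 11 holds for $h$. For $A \in \mathcal{A}_1$ and $h(u^\natural) = u$, we have $\{\iota^\natural(A)(u^\natural)\} = \iota(A)(u)$, so the literal of the cube $\pi(u)$ built from $A$ holds at $u^\natural$. Therefore the conjunction holds, $\models \pi(u)(u^\natural)$, which establishes the property. We next show $S^\natural \models C$ for each conjunct $C$ of $\mu(S)$.

1) Consider $C = \exists x.\ \pi(u)(x)$ for some $u$. Because $h$ is a surjection, $h(u^\natural) = u$ for some $u^\natural$, so $\models \pi(u)(u^\natural)$, and $\models C$.

2) Consider $C = \neg\exists x.\ P(x)$ for a cube $P(x)$ distinct from all cubes $\pi(u)(x)$. Consider any $u^\natural \in U^\natural$. Then $\models \pi(h(u^\natural))(u^\natural)$, and $P(x)$ and $\pi(h(u^\natural))(x)$ are distinct cubes, so $\not\models P(u^\natural)$. Therefore, $\models C$.

3) Consider $C = \neg\exists x.\ \pi(u)(x) \land \neg A(x)$ for some $A \in \mathcal{A} \setminus \mathcal{A}_1$. This means that $\iota(A)(u) = \{1\}$. Consider any $u^\natural$. If $\not\models \pi(u)(u^\natural)$, then $u^\natural$ is not a witness against $C$. If $\models \pi(u)(u^\natural)$, then $h(u^\natural) = u$ by the property, and because $h$ is a homomorphism, $\iota^\natural(A)(u^\natural) = 1$, so $\not\models \neg A(u^\natural)$. Hence $\models C$.

4) Consider $C = \neg\exists x.\ \pi(u)(x) \land A(x)$ for some $A \in \mathcal{A} \setminus \mathcal{A}_1$. Analogously to the previous case, $\iota(A)(u) = \{0\}$. Consider any $u^\natural$. If $\models \pi(u)(u^\natural)$, then $h(u^\natural) = u$, and because $h$ is a homomorphism, $\iota^\natural(A)(u^\natural) = 0$, so $\not\models A(u^\natural)$. Hence $\models C$.

5) Consider conjuncts (15). These conjuncts occur only when $\iota(A)(u) = \{0,1\}$. By the definition of tight concretization, there exists $u^\natural \in U^\natural$ such that $h(u^\natural) = u$ and $\iota^\natural(A)(u^\natural) = 1$. By the property of $h$, $\models \pi(u)(u^\natural)$ and thus $\models \exists x.\ \pi(u)(x) \land A(x)$. Analogously, by the definition of tight concretization there exists $v^\natural \in U^\natural$ such that $h(v^\natural) = u$ and $\iota^\natural(A)(v^\natural) = 0$, so $\models \exists x.\ \pi(u)(x) \land \neg A(x)$.

6) Consider $C = \neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \neg f(x,y)$. Then $\iota(f)(u_1,u_2) = \{1\}$. Consider any $u_1^\natural, u_2^\natural \in U^\natural$. If $\not\models \pi(u_1)(u_1^\natural)$ or $\not\models \pi(u_2)(u_2^\natural)$, the pair is not a witness against $C$. Suppose $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$. Then $h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$; $h$ is a homomorphism, so $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 1$, $\not\models \neg f(u_1^\natural,u_2^\natural)$ and $\models C$.

7) Consider $C = \neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x,y)$. Analogously to the previous case, $\iota(f)(u_1,u_2) = \{0\}$; for any $u_1^\natural, u_2^\natural \in U^\natural$, if $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$, then $h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$, so $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 0$, $\not\models f(u_1^\natural,u_2^\natural)$ and $\models C$.

8) Consider conjuncts (18). These conjuncts occur only when $\iota(f)(u_1,u_2) = \{0,1\}$. By the definition of tight concretization, there exist $u_1^\natural, u_2^\natural \in U^\natural$ such that $h(u_1^\natural) = u_1$, $h(u_2^\natural) = u_2$, and $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 1$. By the property of $h$, $\models \pi(u_1)(u_1^\natural)$ and $\models \pi(u_2)(u_2^\natural)$, and thus $\models \exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x,y)$. Analogously, by the definition of tight concretization, there exist $v_1^\natural, v_2^\natural \in U^\natural$ such that $h(v_1^\natural) = u_1$, $h(v_2^\natural) = u_2$, and $\iota^\natural(f)(v_1^\natural,v_2^\natural) = 0$, so $\models \exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \neg f(x,y)$.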
Direction $\gamma_F(\mu(S)) \subseteq \gamma_T(S)$. Let $S^\natural \in \gamma_F(\mu(S))$; then all conjuncts of $\mu(S)$ are true in $S^\natural$. We show that $S^\natural \sqsubseteq_T^h S$ where $h$ is defined in the same way as in the proof of Proposition 11, so the property relating $h$ and $\pi$ from that proof holds and $h$ is surjective. We show that the homomorphism conditions of Definition 30 are satisfied for $h$.

1) Let us show

$\{\, \iota^\natural(A)(u^\natural) \mid h(u^\natural) = u \,\} = \iota(A)(u)$

for all $A \in \mathcal{A}$ and for all $u \in U$. Consider $A \in \mathcal{A}_1$ and any $u^\natural$ such that $h(u^\natural) = u$. Then $\models \pi(u)(u^\natural)$, so the literal of $\pi(u)$ built from $A$ holds at $u^\natural$, which implies $\{\iota^\natural(A)(u^\natural)\} = \iota(A)(u)$. Moreover, because $\exists x.\ \pi(u)(x)$ holds, the left-hand side is a non-empty set, so it is equal to $\iota(A)(u)$.

Next, consider $A \in \mathcal{A} \setminus \mathcal{A}_1$. Consider first the case $\iota(A)(u) = \{1\}$. Then $\neg\exists x.\ \pi(u)(x) \land \neg A(x)$ occurs in $\mu(S)$. Therefore, if $h(u^\natural) = u$, then $A(u^\natural)$ holds, otherwise the conjunct would be false. Therefore, $\iota^\natural(A)(u^\natural) = 1$. The left-hand side is non-empty, so the equality holds.

The case $\iota(A)(u) = \{0\}$ is analogous: $\neg\exists x.\ \pi(u)(x) \land A(x)$ occurs in $\mu(S)$, so if $h(u^\natural) = u$ then $A(u^\natural)$ does not hold, so $\iota^\natural(A)(u^\natural) = 0$, and the left-hand side is non-empty, so the equality holds.

If $\iota(A)(u) = \{0,1\}$ then the conjuncts (15) hold. Therefore, there exists a node $u^\natural \in U^\natural$ such that $h(u^\natural) = u$ and $\iota^\natural(A)(u^\natural) = 1$, and there exists a node $v^\natural \in U^\natural$ such that $h(v^\natural) = u$ and $\iota^\natural(A)(v^\natural) = 0$. The left-hand side is a set containing both 0 and 1, so it is equal to $\{0,1\}$.

2) Let us show

$\{\, \iota^\natural(f)(u_1^\natural,u_2^\natural) \mid h(u_1^\natural) = u_1 \land h(u_2^\natural) = u_2 \,\} = \iota(f)(u_1,u_2)$   (19)

for all $f \in \mathcal{F}$ and $u_1, u_2 \in U$. This is similar to 1).

Consider first the case $\iota(f)(u_1,u_2) = \{1\}$. Then $\neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land \neg f(x,y)$ occurs in $\mu(S)$. Suppose that $h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$. Then $\pi(u_1)(u_1^\natural)$ and $\pi(u_2)(u_2^\natural)$ hold, so $f(u_1^\natural,u_2^\natural)$ must hold as well, otherwise the conjunct would be false. Therefore, $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 1$. Moreover, because of the conjunct $\exists x.\ \pi(u_1)(x)$ and the conjunct $\exists x.\ \pi(u_2)(x)$, the left-hand side is non-empty, so it is equal to $\{1\}$.

The case $\iota(f)(u_1,u_2) = \{0\}$ is analogous: $\neg\exists x \exists y.\ \pi(u_1)(x) \land \pi(u_2)(y) \land f(x,y)$ occurs in $\mu(S)$, so if $h(u_1^\natural) = u_1$ and $h(u_2^\natural) = u_2$, then $\pi(u_1)(u_1^\natural)$ and $\pi(u_2)(u_2^\natural)$ hold, so $f(u_1^\natural,u_2^\natural)$ is false and $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 0$. Moreover, because of the conjunct $\exists x.\ \pi(u_1)(x)$ and the conjunct $\exists x.\ \pi(u_2)(x)$, the left-hand side is non-empty, so it is equal to $\{0\}$.

Finally, consider the case $\iota(f)(u_1,u_2) = \{0,1\}$. Then the conjuncts (18) hold in $S^\natural$. Because the first conjunct holds, there exist $u_1^\natural$ and $u_2^\natural$ such that $h(u_1^\natural) = u_1$, $h(u_2^\natural) = u_2$ and $\iota^\natural(f)(u_1^\natural,u_2^\natural) = 1$. Therefore, 1 belongs to the left-hand side of (19). Similarly, because the second conjunct holds, 0 belongs to the left-hand side of (19). Therefore (19) holds.

We conclude that $S^\natural \sqsubseteq_T S$. Because every structure $S$ has a corresponding equivalent formula $\mu(S)$, we have models[$T_2$] $\subseteq$ models[$TR_1$]. To conclude models[$T_2$] $\supseteq$ models[$TR_1$], we show that $\mu$ is surjective.

Let $F$ be a canonical conjunction of $TR_1$-literals. For each cube $P(x)$ such that $\exists x.P(x)$ occurs in $F$, let $u_{P(x)}$ be a distinct element. Let $U$ be the set of all such elements $u_{P(x)}$. Property 2 of Definition 33 ensures that $U$ is non-empty. Let $\alpha$ be such that $P(x) = \bigwedge_{A \in \mathcal{A}_1} A(x)^{\alpha(A)}$. Then define $\iota(A)(u_{P(x)}) = \{\alpha(A)\}$ for all $A \in \mathcal{A}_1$. For $A \in \mathcal{A} \setminus \mathcal{A}_1$, define $\iota(A)(u_{P(x)})$ as $\{1\}$ if $\neg\exists x.\ P(x) \land \neg A(x)$ occurs in $F$, as $\{0\}$ if $\neg\exists x.\ P(x) \land A(x)$ occurs in $F$, and as $\{0,1\}$ otherwise. Such a definition of $\iota(A)(u_{P(x)})$ is possible because of Property 4 of Definition 33. Analogously, using Property 5 of Definition 33, for each $f \in \mathcal{F}$ define $\iota(f)(u_{P_1(x)},u_{P_2(x)})$ as $\{1\}$ if $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)$ occurs in $F$, as $\{0\}$ if $\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)$ occurs in $F$, and as $\{0,1\}$ otherwise. Let $S = \langle U, \iota \rangle$. To show $F = \mu(S)$, it suffices to show that $F$ and $\mu(S)$ contain the same set of conjuncts. It is easy to see that each conjunct of $\mu(S)$ occurs in $F$. The converse is also straightforward by Definition 33.

We conclude that $\mu$ is surjective, and models[$T_2$] $\supseteq$ models[$TR_1$], which completes the proof.

It is easy to see that $\mu$ is, in fact, a bijection between the set 3-STRUCT and the set of canonical conjunctions of $TR_1$-literals. ■
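The node-property part of the translation from a three-valued structure to literals, formulas (13), (14) and (15), can be sketched as a small Python function. The string syntax for formulas (`~` for negation, `&` for conjunction) and the name `node_property_literals` are assumptions made only for illustration; the cube of a node is passed as an opaque name.

```python
def node_property_literals(cube, pred, value):
    """Literals contributed by one node (given by its cube name) and one
    non-abstraction predicate whose three-valued interpretation is `value`.

    value == {1}    -> negated existence of an element violating pred   (13)
    value == {0}    -> negated existence of an element satisfying pred  (14)
    value == {0, 1} -> both existence literals                          (15)
    """
    pos = f"exists x. {cube}(x) & {pred}(x)"
    neg = f"exists x. {cube}(x) & ~{pred}(x)"
    if value == {1}:
        return [f"~({neg})"]
    if value == {0}:
        return [f"~({pos})"]
    if value == {0, 1}:
        return [pos, neg]
    raise ValueError("value must be {0}, {1} or {0, 1}")
```

The two positive literals in the last case are what distinguishes tight concretization: they force both truth values to be realized by some concrete element.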

As in Section 3, we proceed to show that we can permit a richer syntactic structure without changing the expressive power of constraints.

Definition 35 ($TR_2$-formulas) A $TR_2$-formula is a disjunction of conjunctions of $TR_1$-literals.

The following Lemma 36 is analogous to Lemma 13: it shows that any conjunction of $TR_1$-literals can be transformed into an equivalent disjunction of canonical conjunctions of $TR_1$-literals.

Lemma 36 Each conjunction of $TR_1$-literals can be written as an equivalent $TR_1$-formula.

Proof. Consider an arbitrary, not necessarily canonical, conjunction $F$ of $TR_1$-literals. We show how to transform $F$ into an equivalent disjunction of canonical conjunctions of $TR_1$-literals. The idea is to transform conjunctions into disjunctions of multiple conjunctions to ensure that all properties in Definition 33 are satisfied. We perform the following transformations as long as any of the properties in Definition 33 is violated.

Property 1. If both $\exists x.P(x)$ and $\neg\exists x.P(x)$ occur, the entire conjunction is false and we eliminate it from the disjunction of conjunctions. If neither $\exists x.P(x)$ nor $\neg\exists x.P(x)$ occurs, use the rule $\mathit{true} \to (\exists x.P(x)) \lor (\neg\exists x.P(x))$ and then distribute the disjunction to the top level of the formula.

Property 2. First ensure that Property 1 holds. If the resulting conjunction contains no conjuncts of the form $\exists x.P(x)$, then the conjunction contains a conjunct $\neg\exists x.P(x)$ for every cube $P(x)$ over $\mathcal{A}_1$. Therefore, the entire conjunction is false and can be eliminated from the disjunction of conjunctions.

Property 3. Suppose that the literal $\neg\exists x.P(x)$ occurs in the conjunction. If the conjunction contains a literal of one of the forms $\exists x.\ P(x) \land Q(x)$, $\exists x \exists y.\ P(x) \land Q(x,y)$, or $\exists x \exists y.\ P(y) \land Q(x,y)$, then the entire conjunction is contradictory and may be omitted from the disjunction of conjunctions. If there are no such literals, then (as in the proof of Lemma 13) remove all literals of the forms $\neg\exists x.\ P(x) \land Q(x)$, $\neg\exists x \exists y.\ P(x) \land Q(x,y)$, and $\neg\exists x \exists y.\ P(y) \land Q(x,y)$, because they are implied by $\neg\exists x.P(x)$.

Property 4. If both a literal and its negation occur in the conjunction, the entire conjunction is false. Hence, we can assume that (a) and (c) do not occur simultaneously and (b) and (c) do not occur simultaneously. To ensure that (a) and (b) do not occur simultaneously, use the replacement rule

$(\neg\exists x.\ P(x) \land A(x)) \land (\neg\exists x.\ P(x) \land \neg A(x)) \;\to\; \neg\exists x.\ P(x)$

and then ensure Property 3 again. We have thus shown how to ensure that no two of the cases (a), (b), (c) hold simultaneously. To ensure that at least one of the cases (a), (b), (c) holds, use the fact that $\neg p \lor \neg q \lor (p \land q)$ is a tautology, and apply the rule

$\mathit{true} \;\to\; (\neg\exists x.\ P(x) \land A(x)) \lor (\neg\exists x.\ P(x) \land \neg A(x)) \lor \bigl((\exists x.\ P(x) \land A(x)) \land (\exists x.\ P(x) \land \neg A(x))\bigr)$

Then propagate the disjunction to the top level of the formula. Then ensure that no two cases apply simultaneously, as described previously.

Property 5. Ensuring Property 5 is analogous to ensuring Property 4. If both a literal and its negation occur in the conjunction, the entire conjunction is false. Hence, we can assume that (a) and (c) do not occur simultaneously and (b) and (c) do not occur simultaneously. To ensure that (a) and (b) do not occur simultaneously, use the replacement rule

$(\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)) \land (\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)) \;\to\; (\neg\exists x.\ P_1(x)) \lor (\neg\exists y.\ P_2(y))$

distribute the resulting disjunction to the top level of the formula, and then ensure Property 3 again. We have thus shown how to ensure that no two of the cases (a), (b), (c) hold simultaneously. To ensure that at least one of the cases (a), (b), (c) holds, use the fact that $\neg p \lor \neg q \lor (p \land q)$ is a tautology, and apply the rule

$\mathit{true} \;\to\; (\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)) \lor (\neg\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y)) \lor \bigl((\exists x \exists y.\ P_1(x) \land P_2(y) \land f(x,y)) \land (\exists x \exists y.\ P_1(x) \land P_2(y) \land \neg f(x,y))\bigr)$

Then propagate the disjunction to the top level of the formula. Then ensure that no two cases apply simultaneously, as described previously. ■
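The case split used for Properties 4 and 5 rests on the propositional tautology stated in the proof, where p and q stand for the two positive existence literals. A brute-force truth-table check in Python (the helper name `is_tautology` is ours, introduced only for this illustration) confirms it:

```python
from itertools import product

def is_tautology(formula, arity):
    """Return True iff `formula` holds under every assignment of
    truth values to its `arity` propositional arguments."""
    return all(formula(*values)
               for values in product([False, True], repeat=arity))

def case_split(p, q):
    # Cases (a) and (b) negate one existence literal each;
    # case (c) asserts both existence literals together.
    return (not p) or (not q) or (p and q)
```

Here `is_tautology(case_split, 2)` evaluates to `True`, whereas dropping case (c) does not give a tautology, which is why all three cases are needed.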



$\exists x.\ B_1(x) \lor B_2(x) \;\to\; (\exists x.\ B_1(x)) \lor (\exists x.\ B_2(x))$

$\exists x.\ (B_1(x) \lor B_2(x)) \land Q(x) \;\to\; (\exists x.\ B_1(x) \land Q(x)) \lor (\exists x.\ B_2(x) \land Q(x))$

$\exists x.\ B_1(x) \land (Q_1(x) \lor Q_2(x)) \;\to\; (\exists x.\ B_1(x) \land Q_1(x)) \lor (\exists x.\ B_1(x) \land Q_2(x))$

$\exists x \exists y.\ (B_{11}(x) \lor B_{12}(x)) \land B_2(y) \land g(x,y) \;\to\; (\exists x \exists y.\ B_{11}(x) \land B_2(y) \land g(x,y)) \lor (\exists x \exists y.\ B_{12}(x) \land B_2(y) \land g(x,y))$

$\exists x \exists y.\ B_1(x) \land (B_{21}(y) \lor B_{22}(y)) \land g(x,y) \;\to\; (\exists x \exists y.\ B_1(x) \land B_{21}(y) \land g(x,y)) \lor (\exists x \exists y.\ B_1(x) \land B_{22}(y) \land g(x,y))$

$\exists x \exists y.\ B_1(x) \land B_2(y) \land (g_1(x,y) \lor g_2(x,y)) \;\to\; (\exists x \exists y.\ B_1(x) \land B_2(y) \land g_1(x,y)) \lor (\exists x \exists y.\ B_1(x) \land B_2(y) \land g_2(x,y))$

Figure 2: Transforming $TR_4$-literals into $TR_1$-literals

Corollary 37 models[$TR_2$] = models[$TR_1$]

Proof. Every $TR_1$-formula is a $TR_2$-formula. Conversely, let $F$ be a $TR_2$-formula. Then $F$ is a disjunction of conjunctions of $TR_1$-literals. By Lemma 36, transform each conjunction of $F$ into a disjunction of canonical conjunctions of $TR_1$-literals. The result is a $TR_1$-formula. ■

$TR_3$-formulas remove the disjunctive normal form requirement on $TR_2$-formulas.

Definition 38 ($TR_3$-formulas) A $TR_3$-formula is a boolean combination of $TR_1$-atomic-formulas.

$TR_2$-formulas are the disjunctive normal forms of $TR_3$-formulas.

Lemma 39 Every $TR_3$-formula is equivalent to a $TR_2$-formula.

Proof. Let $F$ be a $TR_3$-formula. Then the disjunctive normal form of $F$ is a $TR_2$-formula. ■

Corollary 40 models[$TR_3$] = models[$TR_2$]

Proof. Every $TR_2$-formula is a $TR_3$-formula, so models[$TR_3$] $\supseteq$ models[$TR_2$]. The converse models[$TR_3$] $\subseteq$ models[$TR_2$] follows from Lemma 39. ■

Analogously to $R_4$-formulas in Section 3, we introduce $TR_4$-formulas that allow using boolean combinations of more complex atomic formulas.

Definition 41 ($TR_4$-formulas) Let $B_1(x), B_2(y)$ range over arbitrary boolean combinations of elements of $\mathcal{A}_1$, let $Q(x)$ range over disjunctions of literals of the form $A(x)$ and $\neg A(x)$ where $A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $g(x,y)$ range over disjunctions of literals of the form $f(x,y)$ and $\neg f(x,y)$ where $f \in \mathcal{F}$.

A $TR_4$-atomic-formula is a formula of one of the following forms:

1. $\exists x.\ B_1(x)$

2. $\exists x.\ B_1(x) \land Q(x)$

3. $\exists x \exists y.\ B_1(x) \land B_2(y) \land g(x,y)$

A $TR_4$-literal is a $TR_4$-atomic-formula or its negation. A $TR_4$-formula is a boolean combination of $TR_4$-atomic-formulas.

Lemma 42 models[$TR_4$] = models[$TR_3$]

Proof. Each formula of the form $\neg\exists x.\ B_1(x)$ is equivalent to the formula $\neg\exists x.\ B_1(x) \land (A(x) \lor \neg A(x))$, which is of the form $\neg\exists x.\ B_1(x) \land Q(x)$. Therefore, $TR_4$ is a richer class than $TR_3$, so models[$TR_4$] $\supseteq$ models[$TR_3$]. To show the converse, transform each $TR_4$-literal into a boolean combination of $TR_1$-literals.

First, transform each boolean combination $B(x)$ (and $B(y)$) of $\mathcal{A}_1$ predicates into canonical disjunctive normal form, so that each $B(x)$ is a disjunction of cubes. Then apply the rules in Figure 2 to decompose $TR_4$-literals into $TR_1$-literals. ■

Instead of existential quantifiers, we may use atomic formulas that contain universal quantifiers.

Definition 43 ($TR_5$-formulas) Let $B_1(x), B_2(y)$ denote arbitrary boolean combinations of elements of $\mathcal{A}_1$, let $Q_P(x)$ denote conjunctions of literals of the form $A(x)$ and $\neg A(x)$ for $A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $g_P(x,y)$ denote conjunctions of literals $f(x,y)$ and $\neg f(x,y)$ for $f \in \mathcal{F}$.

A $TR_5$-atomic-formula is a formula of one of the following forms:

1. $\forall x.\ B_1(x)$

2. $\forall x.\ B_1(x) \Rightarrow Q_P(x)$

3. $\forall x.\ B_1(x) \Rightarrow \forall y.\ B_2(y) \Rightarrow g_P(x,y)$

A $TR_5$-formula is a boolean combination of $TR_5$-atomic-formulas.



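$TR_5$-atomic-formula $\;\mapsto\;$ $TR_4$-formula

$\forall x.\ B_1(x) \;\mapsto\; \neg\exists x.\ \neg B_1(x)$

$\forall x.\ B_1(x) \Rightarrow (L_1(x) \land \ldots \land L_k(x)) \;\mapsto\; \neg\exists x.\ B_1(x) \land (\overline{L_1}(x) \lor \ldots \lor \overline{L_k}(x))$

$\forall x.\ B_1(x) \Rightarrow \forall y.\ B_2(y) \Rightarrow (L_1(x,y) \land \ldots \land L_k(x,y)) \;\mapsto\; \neg\exists x \exists y.\ B_1(x) \land B_2(y) \land (\overline{L_1}(x,y) \lor \ldots \lor \overline{L_k}(x,y))$

Figure 3: Mapping $TR_5$-atomic-formulas to $TR_4$-formulas

$TR_4$-atomic-formula $\;\mapsto\;$ $TR_5$-formula

$\exists x.\ B_1(x) \;\mapsto\; \neg\forall x.\ \neg B_1(x)$

$\exists x.\ B_1(x) \land (L_1(x) \lor \ldots \lor L_k(x)) \;\mapsto\; \neg\forall x.\ B_1(x) \Rightarrow (\overline{L_1}(x) \land \ldots \land \overline{L_k}(x))$

$\exists x \exists y.\ B_1(x) \land B_2(y) \land (L_1(x,y) \lor \ldots \lor L_k(x,y)) \;\mapsto\; \neg\forall x.\ B_1(x) \Rightarrow \forall y.\ B_2(y) \Rightarrow (\overline{L_1}(x,y) \land \ldots \land \overline{L_k}(x,y))$

Figure 4: Mapping $TR_4$-atomic-formulas to $TR_5$-formulas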



Lemma 44 models[$TR_5$] = models[$TR_4$]

Proof. For each $TR_5$-atomic-formula there exists a corresponding equivalent $TR_4$-formula, and for each $TR_4$-atomic-formula there exists a corresponding equivalent $TR_5$-formula.

The mapping from $TR_5$-atomic-formulas to $TR_4$-formulas is in Figure 3; the mapping of $TR_4$-atomic-formulas to $TR_5$-formulas is in Figure 4. We use the notation $\overline{L}$ to denote $L_1$ if $L$ is of the form $\neg L_1$ for some $L_1$, and $\neg L$ if $L$ is not of the form $\neg L_1$ for some $L_1$. ■

The following Corollary 45 summarizes the results on different representations of constraints corresponding to three-valued structures with tight concretization.

Corollary 45

models[$T_2$] = models[$TR_1$] = models[$TR_2$] = models[$TR_3$] = models[$TR_4$] = models[$TR_5$]

Definition 46 (Boolean Shape Analysis Constraints) We call the set of sets models[$T_2$] boolean shape analysis constraints.

4.1 Closure under Boolean Operations 

By definition, $TR_3$-formulas are closed under all boolean operations.



Corollary 47 The family of sets models[T 2 ] forms a 
boolean algebra of sets which is a subalgebra of the boolean 
algebra of all subsets o/2-STRUCT. 

As an example consequence of closure under all boolean
set operations we obtain the following proposition.

Proposition 48 There is an algorithm that constructs,
given two finite sets of bounded three-valued structures $S_1$
and $S_2$, a finite set of bounded three-valued structures $S_3$
such that:

$$\gamma_T(S_1) \subseteq \gamma_T(S_2) \quad \text{iff} \quad \gamma_T(S_3) = \emptyset$$

Similarly, the equivalence of two sets of three-valued
structures reduces to satisfiability.

Proposition 49 There is an algorithm that constructs,
given two finite sets of bounded three-valued structures $S_1$
and $S_2$, a finite set of bounded three-valued structures $S_3$
such that:

$$\gamma_T(S_1) = \gamma_T(S_2) \quad \text{iff} \quad \gamma_T(S_3) = \emptyset$$
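The reduction in Propositions 48 and 49 is the usual boolean-algebra argument: entailment holds when the difference is empty, and equivalence holds when the symmetric difference is empty. The sketch below illustrates this with a satisfiability check as the only oracle. It is a toy under stated assumptions: formulas are nested tuples, atomic constraints are Python callables, and `satisfiable` enumerates a finite list of candidate structures instead of running a real decision procedure. All names are hypothetical.

```python
from itertools import chain, combinations


def evaluate(formula, structure):
    """Evaluate a boolean combination of atomic constraints on one structure."""
    tag = formula[0]
    if tag == "atom":
        return formula[1](structure)
    if tag == "not":
        return not evaluate(formula[1], structure)
    if tag == "and":
        return evaluate(formula[1], structure) and evaluate(formula[2], structure)
    if tag == "or":
        return evaluate(formula[1], structure) or evaluate(formula[2], structure)
    raise ValueError(f"unknown connective: {tag}")


def satisfiable(formula, candidates):
    """Stand-in oracle: brute-force search over a finite candidate list."""
    return any(evaluate(formula, s) for s in candidates)


def entails(f1, f2, candidates):
    """f1 entails f2 iff (f1 and not f2) is unsatisfiable."""
    return not satisfiable(("and", f1, ("not", f2)), candidates)


def equivalent(f1, f2, candidates):
    """f1 and f2 are equivalent iff their symmetric difference is unsatisfiable."""
    difference = ("or", ("and", f1, ("not", f2)), ("and", f2, ("not", f1)))
    return not satisfiable(difference, candidates)


# Toy candidate structures: all subsets of two node labels.
LABELS = ("a", "b")
CANDIDATES = [
    frozenset(c)
    for c in chain.from_iterable(
        combinations(LABELS, r) for r in range(len(LABELS) + 1)
    )
]
HAS_A = ("atom", lambda s: "a" in s)
HAS_B = ("atom", lambda s: "b" in s)
```

For example, `entails(("and", HAS_A, HAS_B), HAS_A, CANDIDATES)` holds while the converse does not. Only `satisfiable` ever inspects structures, which is why closure under negation and conjunction is what makes the reduction possible.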

4.2 Relationship with Non-Tight Concretization 

In Proposition 50 below we observe that three-valued
structures with tight concretization (Definition 30) are at least
as expressive as three-valued structures with concretization
(Definition B).

Proposition 50 $\mathrm{models}[T_2] \supseteq \mathrm{models}[T_1]$.

Proof. By definition, every $R_4$-formula is a $TR_4$-formula,
so $\mathrm{models}[TR_4] \supseteq \mathrm{models}[R_4]$. Therefore,

$$\mathrm{models}[T_2] = \mathrm{models}[TR_4] \supseteq \mathrm{models}[R_4] = \mathrm{models}[T_1]$$



Proposition 50 implies that, even if we work with the
interpretation of three-valued structures under concretization,
we can convert three-valued structures into boolean
shape analysis constraints and check for entailment or
equivalence of the original constraints via satisfiability of the
richer boolean shape analysis constraints. In fact,
Proposition 51 below shows that boolean shape analysis
constraints $\mathrm{models}[T_2]$ are the smallest extension of the
constraints $\mathrm{models}[T_1]$ that has this desirable property.

Proposition 51 $\mathrm{models}[T_2]$ is the smallest superset of
$\mathrm{models}[T_1]$ that is closed under all boolean operations.

Proof. $\mathrm{models}[T_2] \supseteq \mathrm{models}[T_1]$ by Proposition 50,
and $\mathrm{models}[T_2]$ is closed under all boolean operations by
Corollary 47. Because $\mathrm{models}[T_2] = \mathrm{models}[TR_4]$ and
$\mathrm{models}[T_1] = \mathrm{models}[R_4]$, it remains to show that every
$TR_4$-formula is (equivalent to) a boolean combination of some
$R_4$-literals. By definition, $TR_4$-formulas are boolean
combinations of $TR_4$ atomic formulas, so it suffices to show that
each $TR_4$ atomic formula is a boolean combination of $R_4$
literals. That is certainly true; in fact, it suffices to use at
most one negation of an $R_4$ literal to obtain any $TR_4$ literal.
4.3 Node Splitting

Given a three-valued structure $S = \langle U, \iota \rangle$, it is desirable if
$\iota(A)(u) \in \{\{0\}, \{1\}\}$ for all $u \in U$ and $A \in \mathcal{A}$. This property
holds if all predicates are abstraction predicates, that is, if
$\mathcal{A}_1 = \mathcal{A}$. The following Proposition 52 shows that we can
always assume that $\mathcal{A}_1 = \mathcal{A}$ if the syntactic class of formulas
is sufficiently rich.

Proposition 52 Every $TR_4$-formula with the set of
abstraction predicates $\mathcal{A}_1 \subseteq \mathcal{A}$ is also a $TR_4$-formula with
the set of abstraction predicates $\mathcal{A}_1 = \mathcal{A}$.

Proof. Observe that, in the atomic formula $F_1 \equiv
\exists x.\ B_1(x) \wedge Q(x)$ of Definition 41, the subformula
$B_2(x) \equiv B_1(x) \wedge Q(x)$ is a boolean combination of
predicates from $\mathcal{A}$, so $F_1$ is of the form $\exists x.\ B_2(x)$ for a boolean
combination $B_2$ of predicates from $\mathcal{A}$. ■

Note that the converse of Proposition 52 is not true. For
example, if $A, A' \in \mathcal{A} \setminus \mathcal{A}_1$ then the property
$\neg \exists x.\ A(x) \wedge \neg A'(x)$, which correlates two non-abstraction
predicates, is a $TR_4$-formula with the set of abstraction
predicates $\mathcal{A}$, but is not equivalent to any $TR_4$-formula with
the set of abstraction predicates $\mathcal{A}_1$.

Definition 53 (Split Form) Let $F_1$ be a $TR_4$-formula
with the set of abstraction predicates $\mathcal{A}_1 \subseteq \mathcal{A}$. By
Proposition 52 and Corollary 45 let $F_2$ be a $TR_1$-formula with
the set of abstraction predicates $\mathcal{A}$ such that $F_2$ is equivalent
to $F_1$. We call $F_2$ the split form of $F_1$.

Letting $\mathcal{A}_1 = \mathcal{A}$ in Definition 41 we obtain the following
Corollary 54.

Corollary 54 (Split Form Formulas) The set of split
forms of $TR_4$-formulas is precisely the set of boolean
combinations of formulas of the form

1. $\exists x.\ B_1(x)$

2. $\exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge g(x,y)$

where $B_1(x)$, $B_2(y)$ are boolean combinations of literals of
the form $A(x)$ and $A(y)$ for $A \in \mathcal{A}$, and $g(x,y)$ ranges over
disjunctions of literals of the form $f(x,y)$ and $\neg f(x,y)$ for
$f \in \mathcal{F}$.
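The effect of letting $\mathcal{A}_1 = \mathcal{A}$ can be illustrated on cubes: a cube that fixes only some predicates is equivalent to the disjunction of all total cubes that extend it, which on the structure side corresponds to splitting a node according to the values of the remaining predicates. The sketch below is an illustration under stated assumptions, not the paper's algorithm: cubes are dictionaries from hypothetical predicate names to truth values, and a structure is a list of nodes with unary predicate values only.

```python
from itertools import product


def split_cube(cube, predicates):
    """Yield every total cube over `predicates` that extends the partial `cube`."""
    free = [p for p in predicates if p not in cube]
    for bits in product([False, True], repeat=len(free)):
        total = dict(cube)
        total.update(zip(free, bits))
        yield total


def satisfies(node, cube):
    """A node satisfies a cube if it agrees with every fixed predicate value."""
    return all(node[p] == value for p, value in cube.items())


def exists_cube(structure, cube):
    """Evaluate the sentence 'exists x. cube(x)' in a structure."""
    return any(satisfies(node, cube) for node in structure)


def exists_split(structure, cube, predicates):
    """Evaluate the disjunction of 'exists x. P(x)' over all split cubes P."""
    return any(exists_cube(structure, total) for total in split_cube(cube, predicates))
```

With predicates `("A", "P", "Q")`, the partial cube `{"A": True}` splits into four total cubes, and `exists_cube` and `exists_split` agree on any structure whose nodes assign all three predicates.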

5 Decidability of Independent Predicates 

In this section we present decidability results for constraints
expressed by three-valued structures under tight concretization.
We show that satisfiability, entailment and equivalence
of boolean shape analysis constraints are all decidable.
Boolean shape analysis constraints ($TR_1$-formulas) are more
expressive than $R_1$-formulas by Proposition 50, so we obtain
decidability results for $R_1$-formulas as well.

Formulation of Decidability Problems We assume finite
sets $\mathcal{A}$ and $\mathcal{F}$ of predicates. As a result, the number
of non-isomorphic bounded three-valued structures, and,
therefore, the number of non-equivalent $R_1$-formulas, is finite.
Therefore, for fixed $\mathcal{A}$ and $\mathcal{F}$, a problem of the form:

Given a $TR_1$-formula $F$, is $F$ satisfiable?

is essentially finite and therefore trivially decidable. However,
we are interested in having a single algorithm that
would give decidability for any number of unary and binary
predicates. Therefore, the size of the sets $\mathcal{A}$ and $\mathcal{F}$ is part of
the input to the decision procedure we are looking for. For
example, we are interested in questions of the form:

Given sets $\mathcal{A}$ and $\mathcal{F}$ and a $TR_1$-formula $F$ over
predicates $\mathcal{A}$ and $\mathcal{F}$, is $F$ satisfiable?

In this section we study such decidability questions for
independent predicates, when the three-valued structures are
interpreted over the entire set 2-STRUCT. Section 6.4
addresses the more general case where some of the predicates
are defined using first-order formulas, which means that
formulas are interpreted over a subset of 2-STRUCT.

Satisfiability of $TR_1$-formulas over 2-STRUCT is decidable.
In fact, the proof of the following Lemma 55 shows
that every disjunct of a $TR_1$-formula has a small model in
2-STRUCT.

Lemma 55 Let $F$ be a canonical conjunction of $TR_1$-literals
and let the number of cubes $P(x)$ over $\mathcal{A}_1$ such that
$\exists x.\ P(x)$ occurs in $F$ be $n$. Then there exists a two-valued
structure $S^\natural = \langle U^\natural, \iota^\natural \rangle$ such that $|U^\natural| = 2n$ and $F$ is true in
$S^\natural$.

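Proof. Let $S = \langle U, \iota \rangle$ be the structure that corresponds
to $F$ by the proof of Proposition 34. Let $U = \{u_1, \ldots, u_n\}$.
Define $S^\natural = \langle U^\natural, \iota^\natural \rangle$ as follows. Let
$U^\natural = \{u^\natural_1, \ldots, u^\natural_{2n}\}$.

In the sequel we define $\iota^\natural$ so that $S^\natural \sqsubseteq^h_T S$ where $h$ is given
by

$$h(u^\natural_{2i-1}) = h(u^\natural_{2i}) = u_i$$

for $1 \le i \le n$. By definition, $h$ is surjective.

Define $\iota^\natural(A)$ for $A \in \mathcal{A}_1$ as follows. Let $1 \le i \le n$. Then
$\iota(A)(u_i) = \{0\}$ or $\iota(A)(u_i) = \{1\}$. If $\iota(A)(u_i) = \{0\}$, define

$$\iota^\natural(A)(u^\natural_{2i-1}) = \iota^\natural(A)(u^\natural_{2i}) = 0$$

If $\iota(A)(u_i) = \{1\}$, define

$$\iota^\natural(A)(u^\natural_{2i-1}) = \iota^\natural(A)(u^\natural_{2i}) = 1$$

Define $\iota^\natural(A)$ for $A \in \mathcal{A} \setminus \mathcal{A}_1$ as follows. Let $1 \le i \le n$. If
$\iota(A)(u_i) = \{0\}$, define

$$\iota^\natural(A)(u^\natural_{2i-1}) = \iota^\natural(A)(u^\natural_{2i}) = 0$$

If $\iota(A)(u_i) = \{1\}$, define

$$\iota^\natural(A)(u^\natural_{2i-1}) = \iota^\natural(A)(u^\natural_{2i}) = 1$$

If $\iota(A)(u_i) = \{0, 1\}$, define

$$\iota^\natural(A)(u^\natural_{2i-1}) = 0, \qquad \iota^\natural(A)(u^\natural_{2i}) = 1$$

Define $\iota^\natural(f)$ for $f \in \mathcal{F}$ as follows. Let $1 \le i, j \le n$. If
$\iota(f)(u_i, u_j) = \{0\}$, define

$$\iota^\natural(f)(u^\natural_k, u^\natural_l) = 0$$

for $k \in \{2i-1, 2i\}$ and $l \in \{2j-1, 2j\}$. If $\iota(f)(u_i, u_j) = \{1\}$,
define

$$\iota^\natural(f)(u^\natural_k, u^\natural_l) = 1$$

for $k \in \{2i-1, 2i\}$ and $l \in \{2j-1, 2j\}$. If $\iota(f)(u_i, u_j) =
\{0, 1\}$, define

$$\iota^\natural(f)(u^\natural_k, u^\natural_{2j-1}) = 0, \qquad \iota^\natural(f)(u^\natural_k, u^\natural_{2j}) = 1$$

for $k \in \{2i-1, 2i\}$.

It is straightforward to show $S^\natural \sqsubseteq_T S$. Therefore, $F$
holds in $S^\natural$. ■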
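The construction in the proof can be sketched as a small program that doubles every node of a bounded three-valued structure and then checks that the result embeds tightly. This is a sketch under stated assumptions: nodes are numbered from 0, so the two copies of node `i` are elements `2*i` and `2*i + 1`; predicate values are Python sets drawn from `{0}`, `{1}`, `{0, 1}`; and the treatment of a binary value `{0, 1}` (value 0 toward the first copy of the target, 1 toward the second) mirrors the unary case and is an assumption of this sketch. The function names are hypothetical.

```python
def build_model(n, unary, binary):
    """Build a two-valued structure with 2n elements.

    unary:  {predicate: [value_set for node 0..n-1]}
    binary: {predicate: {(i, j): value_set}} defined on all pairs of nodes
    Returns (size, unary2, binary2) with 0/1 values.
    """
    unary2 = {}
    for name, values in unary.items():
        unary2[name] = {}
        for i, value in enumerate(values):
            if value == {0, 1}:
                # The two copies witness both truth values.
                unary2[name][2 * i] = 0
                unary2[name][2 * i + 1] = 1
            else:
                (bit,) = value
                unary2[name][2 * i] = unary2[name][2 * i + 1] = bit
    binary2 = {}
    for name, table in binary.items():
        binary2[name] = {}
        for (i, j), value in table.items():
            for k in (2 * i, 2 * i + 1):
                if value == {0, 1}:
                    # Assumed choice: 0 toward the first copy, 1 toward the second.
                    binary2[name][(k, 2 * j)] = 0
                    binary2[name][(k, 2 * j + 1)] = 1
                else:
                    (bit,) = value
                    binary2[name][(k, 2 * j)] = bit
                    binary2[name][(k, 2 * j + 1)] = bit
    return 2 * n, unary2, binary2


def tightly_embeds(n, unary, binary, model):
    """Check that h(k) = k // 2 is surjective and that the values observed on
    the pre-images of each node (pair) are exactly the three-valued value."""
    size, unary2, binary2 = model
    h = lambda k: k // 2
    if {h(k) for k in range(size)} != set(range(n)):
        return False
    for name, values in unary.items():
        for i in range(n):
            seen = {unary2[name][k] for k in range(size) if h(k) == i}
            if seen != set(values[i]):
                return False
    for name, table in binary.items():
        for (i, j), value in table.items():
            seen = {
                binary2[name][(k, l)]
                for k in range(size)
                for l in range(size)
                if h(k) == i and h(l) == j
            }
            if seen != set(value):
                return False
    return True
```

For a structure with `n = 2` nodes, `build_model` returns a structure with four elements, and `tightly_embeds` confirms that each value `{0, 1}` is witnessed by both a 0 and a 1 among the copies.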

Corollary 56 $\gamma_T(S) = \emptyset$ iff $S = \emptyset$.
Proof. By Lemma 55. ■

We note that the construction of the model in Lemma 55
becomes even simpler if we assume that the formula $F$ is in
the split form. Corollary 56 then follows from the observation
that if $F$ has at least one disjunct then the split form of $F$
has at least one disjunct.

Corollary 57 The following questions are decidable for
sets $S_1$, $S_2$ of three-valued structures:

1. $\gamma_T(S_1) \subseteq \gamma_T(S_2)$;

2. $\gamma_T(S_1) = \gamma_T(S_2)$.

Proof. By Corollary 56, Proposition 48 and Proposition 49. ■

6 Structures with Defined Predicates

In this section we introduce the notion of a three-valued
structure with defined predicates. Previous sections
interpret three-valued structures and formulas over the set
2-STRUCT of all two-valued structures. In general, it is
useful to interpret three-valued structures and formulas over
some subset 2-CSTRUCT $\subseteq$ 2-STRUCT of compatible
two-valued structures [49, Page 268].

6.1 Compatible Structures 

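We view structures with defined predicates as a way of defining
a subset 2-CSTRUCT $\subseteq$ 2-STRUCT.

Definition 58 (Compatible Structures) Let $\mathcal{A}_2 \subseteq \mathcal{A}$
be a set of defined unary predicates and $\mathcal{F}_2 \subseteq \mathcal{F}$ be a
set of defined binary predicates. Let 2-SEM-STRUCT $\subseteq$
2-STRUCT be the set of two-valued structures that satisfy
the constraints of the semantics of the programming language.
Next, for each $A \in \mathcal{A}_2$, and each two-valued structure
$S^\natural \in$ 2-STRUCT where $S^\natural = \langle U^\natural, \iota^\natural \rangle$, let
$d_A(S^\natural) : U^\natural \to \{0, 1\}$ be a unary predicate. For each
$f \in \mathcal{F}_2$, and each two-valued structure $S^\natural \in$ 2-STRUCT
where $S^\natural = \langle U^\natural, \iota^\natural \rangle$, let
$d_f(S^\natural) : (U^\natural)^2 \to \{0, 1\}$ be a binary predicate. Define

$$
\begin{array}{rl}
\text{2-CSTRUCT} = \{ S^\natural = \langle U^\natural, \iota^\natural \rangle \mid & S^\natural \in \text{2-SEM-STRUCT} \ \wedge \\
& \bigwedge_{A \in \mathcal{A}_2} \forall u^\natural \in U^\natural.\ \iota^\natural(A)(u^\natural) = d_A(S^\natural)(u^\natural) \ \wedge \\
& \bigwedge_{f \in \mathcal{F}_2} \forall u^\natural_1, u^\natural_2 \in U^\natural.\ \iota^\natural(f)(u^\natural_1, u^\natural_2) = d_f(S^\natural)(u^\natural_1, u^\natural_2) \}
\end{array}
\qquad (20)
$$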



Definition 59 below introduces tight concretization with
respect to compatible structures, in the natural way.

Definition 59 (Compatible Tight Concretization) If
$S \subseteq$ 3-STRUCT is a set of three-valued structures, define

$$c\gamma_T(S) = \gamma_T(S) \cap \text{2-CSTRUCT}$$

We then use $c\gamma_T$ to define the class of definable sets
$\mathrm{models}[cT_2]$. With the results of Section 4.3 in mind, we
let $\mathcal{A}_1 = \mathcal{A}$.

Definition 60 The set of sets of compatible two-valued
structures definable via three-valued structures with tight
concretization is defined by:

$$\mathrm{models}[cT_2] = \{ c\gamma_T(S) \mid S \text{ a finite set of } \mathcal{A}_1\text{-bounded three-valued structures} \}$$

Lemma 61

$$\mathrm{models}[cT_2] = \{ \mathcal{S}^\natural \cap \text{2-CSTRUCT} \mid \mathcal{S}^\natural \in \mathrm{models}[T_2] \}$$

Proof. Immediate by Definition 60 and Definition 59. ■

6.2 Formulas for Compatible Structures 

In Section 4 we have characterized sets of two-valued
structures using formulas. We now characterize sets of compatible
two-valued structures by conjoining the formulas with the
compatibility formula.

Definition 62 (Compatibility Formula) Let $\psi_0$ be a
sentence that axiomatizes the set 2-SEM-STRUCT, so that:

$$\mathrm{models}[\{\psi_0\}] = \text{2-SEM-STRUCT}$$

Let the value of each predicate $d_A(S^\natural)$ for $A \in \mathcal{A}_2$ be equal
to the Tarskian semantics of some formula $\psi_A(x)$ in the
structure $S^\natural$:

$$d_A(S^\natural)(u^\natural) = [\![\psi_A(x)]\!]^{S^\natural}\,[x \mapsto u^\natural]$$

and let the value of each predicate $d_f(S^\natural)$ for $f \in \mathcal{F}_2$ be equal
to the Tarskian semantics of some formula $\psi_f(x,y)$ in the
structure $S^\natural$:

$$d_f(S^\natural)(u^\natural_1, u^\natural_2) = [\![\psi_f(x,y)]\!]^{S^\natural}\,[x \mapsto u^\natural_1, y \mapsto u^\natural_2]$$

Define the compatibility formula $F_\psi$ by:

$$
F_\psi \equiv \psi_0 \ \wedge\
\bigwedge_{A \in \mathcal{A}_2} \forall x.\ A(x) \Leftrightarrow \psi_A(x) \ \wedge\
\bigwedge_{f \in \mathcal{F}_2} \forall x\, y.\ f(x,y) \Leftrightarrow \psi_f(x,y)
$$

For each class of formulas $TR_i$ we introduce the corresponding
class $cTR_i$ by conjoining the formulas with $F_\psi$.

Definition 63 (Formulas for Compatible Structures)
For each $i$ where $1 \le i \le 5$, let the set of $cTR_i$ formulas be
the set of all formulas $B \wedge F_\psi$ for $B$ a $TR_i$ formula.

Lemma 64 below shows that the compatibility formula
defines precisely the subset of compatible two-valued
structures.
Lemma 64 (Compatibility Formula is Correct)

$$\text{2-CSTRUCT} = \{ S^\natural \in \text{2-STRUCT} \mid [\![F_\psi]\!]^{S^\natural} = 1 \}$$

Proof. Immediate by Definition 58 and Definition 62. ■
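Membership in 2-CSTRUCT can be sketched as a direct check: the structure must satisfy the semantic constraints, and every defined predicate must agree pointwise with its defining formula. The sketch below is an illustration under stated assumptions. Defining formulas are modeled as Python callables rather than first-order formulas, and the predicates `next` and `has_next`, the semantic constraint, and all function names are hypothetical examples not taken from the paper.

```python
def is_compatible(structure, semantic_ok, unary_defs, binary_defs):
    """Check that a two-valued structure is compatible.

    structure:   (universe, unary, binary) with
                 unary[name][u] and binary[name][(u, v)] truth values
    semantic_ok: callable standing in for the sentence axiomatizing 2-SEM-STRUCT
    unary_defs:  {name: definition(structure, u)} for defined unary predicates
    binary_defs: {name: definition(structure, u, v)} for defined binary predicates
    """
    universe, unary, binary = structure
    if not semantic_ok(structure):
        return False
    for name, definition in unary_defs.items():
        for u in universe:
            if unary[name][u] != definition(structure, u):
                return False
    for name, definition in binary_defs.items():
        for u in universe:
            for v in universe:
                if binary[name][(u, v)] != definition(structure, u, v):
                    return False
    return True


def next_is_partial_function(structure):
    """Hypothetical semantic constraint: each node has at most one successor."""
    universe, _, binary = structure
    return all(
        sum(binary["next"][(u, v)] for v in universe) <= 1 for u in universe
    )


def has_next_definition(structure, u):
    """Hypothetical defining formula: has_next(x) iff exists y. next(x, y)."""
    universe, _, binary = structure
    return any(binary["next"][(u, v)] for v in universe)
```

On a three-node list `0 -> 1 -> 2`, a structure that records `has_next` as true for nodes 0 and 1 and false for node 2 passes the check, while one that marks node 2 as having a successor is rejected.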

As a result, we obtain the following characterization of
the constraints expressible using $cTR_i$ formulas.

Lemma 65 For each $i$ where $1 \le i \le 5$,

$$\mathrm{models}[cTR_i] = \{ \mathcal{S}^\natural \cap \text{2-CSTRUCT} \mid \mathcal{S}^\natural \in \mathrm{models}[TR_i] \}$$

Proof. By Definition 63 and Lemma 64. ■

The following Corollary 66 states the desired correspondence
between formulas and three-valued structures with
defined predicates.

Corollary 66

$$\mathrm{models}[cT_2] = \mathrm{models}[cTR_1] = \mathrm{models}[cTR_2] = \mathrm{models}[cTR_3] = \mathrm{models}[cTR_4] = \mathrm{models}[cTR_5]$$

Proof. From Lemma 61, Lemma 65, and Corollary 45. ■

6.3 Closure under Boolean Operations 

We next show that, even in the presence of defined predicates,
we can reduce the entailment and the equivalence of
constraints to the satisfiability problem. This result follows
from the closure under boolean operations. The results
below generalize the results of Section 4.1.

Corollary 67 The family of sets $\mathrm{models}[cT_2]$ forms a
boolean algebra of sets which is a subalgebra of the boolean
algebra of all subsets of 2-CSTRUCT.

Proof. From Lemma 61, Lemma 65, and Corollary 47. ■

Proposition 68 There is an algorithm that constructs,
given two finite sets of three-valued structures $S_1$ and $S_2$,
a finite set of three-valued structures $S_3$ such that:

$$c\gamma_T(S_1) \subseteq c\gamma_T(S_2) \quad \text{iff} \quad c\gamma_T(S_3) = \emptyset$$

Proposition 69 There is an algorithm that constructs,
given two finite sets of three-valued structures $S_1$ and $S_2$,
a finite set of three-valued structures $S_3$ such that:

$$c\gamma_T(S_1) = c\gamma_T(S_2) \quad \text{iff} \quad c\gamma_T(S_3) = \emptyset$$

6.4 Decidability Properties 

The following conditional result generalizes the idea of
Section 5.

Corollary 70 Let $S, S_1, S_2$ range over finite sets of
three-valued structures with defined predicates. Assume that the
question $c\gamma_T(S) = \emptyset$ is decidable. Then the following
questions are decidable as well:

1. $c\gamma_T(S_1) \subseteq c\gamma_T(S_2)$;

2. $c\gamma_T(S_1) = c\gamma_T(S_2)$.

Proof. By Proposition 68 and Proposition 69. ■



We present an example of constraints for which the
satisfiability question is decidable in [36]; other examples of
decidable constraints can be formulated based on the
techniques of logic $L_r$ of [4] or based on monadic second-order
logic of trees which is in the heart of the graph types ap- 
proach da E3 Eol EH [m EU . 



7 Related Work 

A parametric framework for shape analysis is presented in
[49]. A systematic presentation of three-valued logic with
equality is given in [44]. A description of a three-valued logic
analyzer is in [3S] , an extension to interprocedural analysis is 
in [47] and the use of shape analysis for program verification 
is demonstrated in [35]. Other shape analysis techniques 
include [37l l25l IT6 l I2T ] l20l 133) [33] . 

Our paper presents a contribution to the characterization
of heap summaries by formulas, which is a promising
direction of shape analysis that has been initiated in
[52, 46, 35, 34]. Shape analysis constraints differ from regular
graph constraints [35, 34] because shape analysis constraints
characterize sets of objects by defining predicates,
instead of using existential quantification over sets of objects.
Logic $L_r$ in [4] allows specifying reachability properties
between local variables and is therefore appropriate for
expressing certain classes of shape graphs. What $L_r$ does
not allow is defining a set of nodes $A$ using some predicate
and then stating further properties of objects in the set $A$,
which is one of the main expressive features of three-valued
structures.

Our work follows the line of shape analysis approaches 
which view program as transforming concrete graph struc- 
tures gni E3 HSl [2D [2LH S3I [331- An alternative approach 
is to identify each heap object using the set of paths that 
lead to the object [16, 23, 8]. Other notations for reasoning
about the heap include spatial logic [11, 10, 45, 26] and alias
types [501 ED- 

It is possible to apply predicate abstraction techniques
[3, 2, 22] to perform shape analysis; the view of three-valued
structures as boolean combinations of constraints of certain
form may be beneficial for this direction of work and enable
easier application of representations such as binary decision
diagrams [5, 41, 42].

A shape analysis tool must ultimately take into account
the definitions of instrumentation predicates, which requires
some form of theorem proving or decision procedures.
[49, Page 272] uses rules based on Horn clauses for such
reasoning, whereas [46] proposes the use of theorem provers. In
this paper we have identified one component of the problem
that is always decidable and useful: it is always possible to
reduce entailment and equivalence problems to the
satisfiability problem. In [36], we report a concrete example of
constraints for which satisfiability is decidable; the results
in the present paper then imply that entailment
and equivalence are decidable as well.

Researchers have proposed several program checking 
techniques based on dataflow analysis, symbolic execution, 
and abstract interpretation [T51 HI QS] E21 M \M ■ The 
primary strength of the shape analysis approach compared 
to the alternative approaches is the ability to perform sound 
and precise reasoning about dynamically allocated data 
structures. 

The boolean algebra of state predicates and predicate 
transformers has been used successfully as the foundation 
of refinement calculus [1]. In this paper we have identified
a particular subalgebra of the boolean algebra of all state 
predicates; we view this boolean algebra as providing the 
foundation of shape analysis. 
8 Conclusions

We have characterized constraints used as dataflow facts of
parametric shape analysis based on three-valued logic. Our
characterization represents these dataflow facts as boolean
combinations of formulas. The usual concretization semantics
yields only positive boolean combinations. On the other
hand, the tight concretization yields boolean shape analysis
constraints, which are closed under all boolean combinations.
Among the useful consequences of the closure of
boolean shape analysis constraints under all boolean operations
is the fact that the entailment and the equivalence of
constraints are reducible to the satisfiability of constraints.

We view the results of this paper as a step toward a better
understanding of the foundations of shape analysis. To make
the connection with [49], this paper starts with three-valued
structures and proceeds to characterize the structures using
formulas. An alternative approach is to start with canonical
formulas that express the desired properties and then explore
efficient ways of representing and manipulating these
formulas. We believe that the entire framework of [49] can
be reformulated using canonical forms of formulas instead
of three-valued structures. We also expect that the idea
of viewing dataflow facts as canonical forms of formulas is
methodologically useful in general, especially for
analyses that verify complex program properties.